Split large string into substrings
Note: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow
Original URL: http://stackoverflow.com/questions/7568112/
Asked by didymos
I have a huge string like:
ABCDEFGHIJKLM...
and I would like to split it into substrings of length 5 in this way:
>1
ABCDE
>2
BCDEF
>3
CDEFG[...]
UPDATE
Solution:
OK, thanks to you guys I was able to find a way to do this fast! This is my solution, combining a few ideas from here:
str="ABCDEFGHIJKLMNOP"
# Print the string from character $2 onward in 5-character chunks.
splitfive(){ echo "$1" | cut -c "$2"- | sed -r 's/(.{5})/\1\n/g' ; }
# cut numbers characters from 1, so the offsets run from 1 to 5.
for (( i=1 ; i <= 5 ; i++ )) ; do splitfive "$str" "$i" ; done | grep -v "^$"
Answered by chown
${string:position:length}
Extracts $length characters of substring from $string at $position.
stringZ=abcABC123ABCabc
# 0123456789.....
# 0-based indexing.
echo ${stringZ:0} # abcABC123ABCabc
echo ${stringZ:1} # bcABC123ABCabc
echo ${stringZ:7} # 23ABCabc
echo ${stringZ:0:5} # abcAB
# Five characters of substring.
Then use a loop that adds 1 to the position on each pass to extract each substring of length 5.
# Stop at the last position that still leaves 5 characters.
for i in $(seq 0 $(( ${#stringZ} - 5 ))); do
echo "${stringZ:$i:5}"
done
All from Bash string manipulation
Answered by Kent
sed can do it in one shot:
kent$ echo "abcdefghijklmnopqr"|sed -r 's/(.{5})/\1 /g'
abcde fghij klmno pqr
or, depending on your needs:
kent$ echo "abcdefghijklmnopqr"|sed -r 's/(.{5})/\1\n/g'
abcde
fghij
klmno
pqr
update
I thought it was simply a string-splitting problem and didn't read the question very carefully. Now it should give what you need:
still one shot, but with awk this time:
kent$ echo "abcdefghijklmnopqr"|awk '{while(length($0)>=5){print substr($0,1,5);gsub(/^./,"")}}'
abcde
bcdef
cdefg
defgh
efghi
fghij
ghijk
hijkl
ijklm
jklmn
klmno
lmnop
mnopq
nopqr
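A variation on the awk approach, as a minimal sketch: instead of deleting the first character on every pass, it walks an index over the line with substr and also prints the ">N" headers the question asks for. The function name awk_windows and the variable k are illustrative, and whether this is actually faster on a huge string is an assumption worth measuring.

```shell
# awk_windows WIDTH: read lines on stdin and print every WIDTH-character
# window of each line, preceded by a ">N" header (N restarts on each line).
awk_windows() {
    awk -v k="$1" '{
        # substr() positions are 1-based; stop when fewer than k characters remain.
        for (i = 1; i + k - 1 <= length($0); i++)
            printf ">%d\n%s\n", i, substr($0, i, k)
    }'
}

echo "abcdefghijklmnopqr" | awk_windows 5
```

Passing the width with -v keeps the awk program itself free of shell interpolation.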
Answered by Zack
fold -w5 should do the trick.
$ echo "ABCDEFGHIJKLMNOPQRSTUVWXYZ" | fold -w5
ABCDE
FGHIJ
KLMNO
PQRST
UVWXY
Z
Cheers!
Answered by glenn jackman
In bash:
s=ABCDEFGHIJ
for (( i=0; i < ${#s}-4; i++ )); do
printf ">%d\n%s\n" $((i+1)) ${s:$i:5}
done
outputs
>1
ABCDE
>2
BCDEF
>3
CDEFG
>4
DEFGH
>5
EFGHI
>6
FGHIJ
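A sketch of the bash substring loop wrapped in a function, so that the window width becomes a parameter instead of a hard-coded 5. The name windows and its argument order are illustrative choices, not part of the original answer.

```shell
#!/usr/bin/env bash
# windows STRING [WIDTH]: print each WIDTH-character window of STRING,
# preceded by a ">N" header. WIDTH defaults to 5.
windows() {
    local s=$1 k=${2:-5} i
    # The last window starts at offset ${#s}-k; a shorter string prints nothing.
    for (( i = 0; i + k <= ${#s}; i++ )); do
        printf '>%d\n%s\n' "$((i + 1))" "${s:i:k}"
    done
}

windows ABCDEFGHIJ 5
```

The offset and length in ${s:i:k} are evaluated as arithmetic expressions, which is why the bare names i and k work there.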
Answered by potong
sed can do it:
sed -nr ':a;h;s/(.{5}).*/\1/p;g;s/.//;ta;' <<<"ABCDEFGHIJKLM" | # split string
sed '=' | sed '1~2s/^/>/' # add line numbers and insert '>'
Answered by holygeek
Would sed do it?:
$ sed 's/\(.....\)/\1\n/g' < filecontaininghugestring
Answered by Fredrik Pihl
...or use the split command:
$ ls
$ echo "abcdefghijklmnopqr" | split -b5
$ ls
xaa xab xac xad
$ cat xaa
abcde
split also operates on files...
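Since split also works on files, here is a small sketch of that usage. The file name huge.txt and the prefix chunk_ are made up for the example. Note that split produces consecutive, non-overlapping pieces, not the sliding windows from the question.

```shell
# Work in a scratch directory so the generated files are easy to clean up.
dir=$(mktemp -d)
printf '%s' "abcdefghijklmnopqr" > "$dir/huge.txt"

# -b 5: at most 5 bytes per output file; "chunk_" replaces the default "x" prefix.
split -b 5 "$dir/huge.txt" "$dir/chunk_"

ls "$dir"             # lists chunk_aa through chunk_ad, plus huge.txt
cat "$dir/chunk_ad"   # the last piece holds the remaining 3 bytes
```

Remove the scratch directory with rm -rf "$dir" once the pieces are no longer needed.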
Answered by sorpigal
str=ABCDEFGHIJKLM
splitfive(){ echo "${1:$2:5}" ; }
for (( i=0 ; i < ${#str} ; i++ )) ; do splitfive "$str" $i ; done
Or, perhaps you want to do something more intelligent with the results
#!/usr/bin/env bash
splitstr(){
printf '%s\n' "${1:$2:$3}"
}
n=$1
offset=$2
declare -a by_fives
while IFS= read -r str ; do
for (( i=0 ; i < ${#str} ; i++ )) ; do
by_fives=("${by_fives[@]}" "$(splitstr "$str" $i $n)")
done
done
echo ${by_fives[$offset]}
And then call it
$ split-by 5 2 <<<"ABCDEFGHIJKLM"
CDEFG
You can adapt it from there.
EDIT: trivial version in C, for performance comparison:
#include <stdio.h>
int main(void){
FILE* f;
int n=0;
char five[6];
five[5] = '\0';
f = fopen("inputfile", "r");
if(f!=0){
fread(&five, sizeof(char), 5, f);
while(!feof(f)){
printf("%s\n", five);
fseek(f, ++n, SEEK_SET);
fread(&five, sizeof(char), 5, f);
}
}
return 0;
}
Forgive my bad C, I really don't know the language.
Answered by stefanB
You could use cut and specify characters instead of fields, and then change the output delimiter to whatever you need, like a new line:
echo "ABCDEFGHIJKLMNOP" | cut --output-delimiter=$'\n' -c1-5,6-10,11-15
output
ABCDE
FGHIJ
KLMNO
or
echo "ABCDEFGHIJKLMNOP" | cut --output-delimiter=$':' -c1-5,6-10,11-15
output
ABCDE:FGHIJ:KLMNO
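Writing out character ranges such as 1-5,6-10,11-15 by hand does not scale to a huge string. A sketch of generating the list in a loop, assuming GNU cut (the --output-delimiter option is a GNU extension) and using a hypothetical helper named cut_chunks. Like the commands it builds on, it yields consecutive pieces, not overlapping windows.

```shell
#!/usr/bin/env bash
# cut_chunks STRING [WIDTH]: split STRING into consecutive WIDTH-character
# pieces, one per line, by building a list such as 1-5,6-10,11-15 for cut.
cut_chunks() {
    local s=$1 k=${2:-5} list= start
    for (( start = 1; start <= ${#s}; start += k )); do
        # Append "start-end", with a leading comma for every range after the first.
        list+="${list:+,}${start}-$((start + k - 1))"
    done
    [ -n "$list" ] || return 0   # empty input: nothing to cut
    printf '%s\n' "$s" | cut --output-delimiter=$'\n' -c "$list"
}

cut_chunks ABCDEFGHIJKLMNOP
```

The final range may extend past the end of the string; cut simply prints whatever characters exist in it.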