Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/7568112/

Date: 2020-09-09 01:13:32  Source: igfitidea

Split large string into substrings

Tags: string, bash, shell

Asked by didymos

I have a huge string like:

ABCDEFGHIJKLM...

and I would like to split it into overlapping substrings of length 5, each starting one character after the previous one, like this:

>1
ABCDE
>2
BCDEF
>3
CDEFG

[...]

UPDATE

Solution:
OK, thanks to you guys I was able to find a way to do this fast! This is my solution, combining a few ideas from here:

str="ABCDEFGHIJKLMNOP"
splitfive(){ echo "$1" | cut -c "$2"- | sed -r 's/(.{5})/\1\n/g' ; }
# cut positions are 1-based, so the offsets run from 1 to 5
for (( i=1 ; i <= 5 ; i++ )) ; do splitfive "$str" $i ; done | grep -v "^$"

Answered by chown

${string:position:length}

Extracts a substring of $length characters from $string, starting at $position.

stringZ=abcABC123ABCabc
#       0123456789.....
#       0-based indexing.

echo ${stringZ:0}                            # abcABC123ABCabc
echo ${stringZ:1}                            # bcABC123ABCabc
echo ${stringZ:7}                            # 23ABCabc

echo ${stringZ:0:5}                          # abcAB
                                             # Five characters of substring.

Then use a loop that adds 1 to the position on each pass to extract every substring of length 5.

# stop at the last position that still leaves five characters
for i in $(seq 0 $(( ${#stringZ} - 5 ))); do
    echo "${stringZ:$i:5}"
done

All from Bash string manipulation
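The substring expansion above can be wrapped in a small function that also prints the `>N` headers from the question. This is only a sketch: the name `sliding_windows` and the optional width argument are assumptions for illustration, not part of the answer.

```shell
#!/usr/bin/env bash
# sliding_windows is a hypothetical helper name, not part of the answer.
# Usage: sliding_windows STRING [WIDTH]   (WIDTH defaults to 5)
sliding_windows() {
    local s=$1 width=${2:-5} i
    # Stop at the last offset that still leaves a full-width window.
    for (( i = 0; i <= ${#s} - width; i++ )); do
        # Headers are 1-based, bash substring offsets are 0-based.
        printf '>%d\n%s\n' "$((i + 1))" "${s:i:width}"
    done
}

sliding_windows "ABCDEFG" 5
```

A string shorter than the width produces no output, because the loop condition is false from the start.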

Answered by Kent

sed can do it in one shot:

kent$  echo "abcdefghijklmnopqr"|sed -r 's/(.{5})/\1 /g'
abcde fghij klmno pqr

or, depending on your needs:

kent$  echo "abcdefghijklmnopqr"|sed -r 's/(.{5})/\1\n/g'
abcde
fghij
klmno
pqr

Update

I thought it was just a simple string-splitting problem and didn't read the question very carefully. Now it should give what you need.

Still one shot, but with awk this time:

kent$  echo "abcdefghijklmnopqr"|awk '{while(length($0)>=5){print substr($0,1,5);gsub(/^./,"")}}'
abcde
bcdef
cdefg
defgh
efghi
fghij
ghijk
hijkl
ijklm
jklmn
klmno
lmnop
mnopq
nopqr
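A variant of the awk approach can index into the line with `substr` instead of deleting the first character on every pass, and print the `>N` headers at the same time. This is a sketch only; `awk_windows` and its optional width argument are hypothetical names introduced for illustration.

```shell
# awk_windows is a hypothetical helper name; the second argument is the
# window width (defaults to 5).
awk_windows() {
    printf '%s\n' "$1" | awk -v n="${2:-5}" '{
        # substr() is 1-based; stop once fewer than n characters remain.
        for (i = 1; i + n - 1 <= length($0); i++)
            printf ">%d\n%s\n", i, substr($0, i, n)
    }'
}

awk_windows "abcdefg"
```

Leaving `$0` unmodified avoids rewriting the record on every pass, which may matter for very long strings.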

Answered by Zack

fold -w5 should do the trick.

$ echo "ABCDEFGHIJKLMNOPQRSTUVWXYZ" | fold -w5
ABCDE
FGHIJ
KLMNO
PQRST
UVWXY
Z

Cheers!

Answered by glenn jackman

In bash:

s=ABCDEFGHIJ
for (( i=0; i < ${#s}-4; i++ )); do
  printf ">%d\n%s\n" $((i+1)) "${s:$i:5}"
done

outputs

>1
ABCDE
>2
BCDEF
>3
CDEFG
>4
DEFGH
>5
EFGHI
>6
FGHIJ

Answered by potong

sed can do it:

 sed -nr ':a;h;s/(.{5}).*/\1/p;g;s/.//;ta;' <<<"ABCDEFGHIJKLM" | # split string
     sed '=' | sed '1~2s/^/>/' # add line numbers and insert '>'

Answered by holygeek

Would sed do it?:

$ sed 's/\(.....\)/\1\n/g' < filecontaininghugestring

Answered by Fredrik Pihl

...or use the split command:

$ ls

$ echo "abcdefghijklmnopqr" | split -b5

$ ls
xaa  xab  xac  xad

$ cat xaa
abcde

split also operates on files...
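For the file case, a minimal sketch might look like the following. The temporary directory, the `input.txt` file name, and the `chunk_` prefix are all assumptions chosen for illustration. Note that split produces consecutive, non-overlapping pieces, not the sliding windows asked for in the question.

```shell
# All file and directory names here are illustrative assumptions.
workdir=$(mktemp -d)
# printf avoids the trailing newline that echo would add to the last chunk.
printf '%s' "abcdefghijklmnopqr" > "$workdir/input.txt"

# Write 5-byte pieces named chunk_aa, chunk_ab, ... next to the input file.
( cd "$workdir" && split -b 5 input.txt chunk_ )

# Print each chunk on its own line.
for f in "$workdir"/chunk_*; do
    printf '%s\n' "$(cat "$f")"
done
```

Remove the temporary directory with `rm -rf "$workdir"` once the chunks are no longer needed.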

Answered by sorpigal

str=ABCDEFGHIJKLM
splitfive(){ echo "${1:$2:5}" ; }
for (( i=0 ; i < ${#str} ; i++ )) ; do splitfive "$str" $i ; done

Or, perhaps you want to do something more intelligent with the results

#!/usr/bin/env bash

splitstr(){
    printf '%s\n' "${1:$2:$3}"
}

n=$1
offset=$2

declare -a by_fives

while IFS= read -r str ; do
    for (( i=0 ; i < ${#str} ; i++ )) ; do
            by_fives=("${by_fives[@]}" "$(splitstr "$str" $i $n)")
    done
done

echo "${by_fives[$offset]}"

And then call it

$ split-by 5 2 <<<"ABCDEFGHIJKLM"
CDEFG

You can adapt it from there.

EDIT: trivial version in C, for performance comparison:

#include <stdio.h>

int main(void){
    FILE* f;
    int n=0;
    char five[6];

    five[5] = '\0';

    f = fopen("inputfile", "r");
    if(f != NULL){
        /* read a 5-byte window, print it, then seek to the next offset */
        while(fread(five, sizeof(char), 5, f) == 5){
            printf("%s\n", five);
            fseek(f, ++n, SEEK_SET);
        }
        fclose(f);
    }

    return 0;
}

Forgive my bad C, I really don't know the language.

Answered by stefanB

You could use cut and specify characters instead of fields, and then change the output delimiter to whatever you need, like a newline (--output-delimiter is a GNU cut option):

echo "ABCDEFGHIJKLMNOP" | cut --output-delimiter=$'\n' -c1-5,6-10,11-15

output

ABCDE
FGHIJ
KLMNO

or

echo "ABCDEFGHIJKLMNOP" | cut --output-delimiter=$':' -c1-5,6-10,11-15

output

ABCDE:FGHIJ:KLMNO
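The cut examples hard-code the character ranges. To get the overlapping windows from the question, one option is to generate a range per starting position in a loop. The sketch below assumes a hypothetical helper name, `cut_windows`, with an optional width argument, and uses only the portable `-c` option.

```shell
# cut_windows is a hypothetical helper name; the second argument is the
# window width (defaults to 5).
cut_windows() {
    str=$1
    width=${2:-5}
    start=1
    # cut positions are 1-based; stop when the window would run past the end.
    while [ $((start + width - 1)) -le "${#str}" ]; do
        printf '>%d\n' "$start"
        printf '%s\n' "$str" | cut -c "$start-$((start + width - 1))"
        start=$((start + 1))
    done
}

cut_windows "ABCDEFG"
```

This starts one cut process per window, so for a truly huge string a single-process approach such as awk or a pure bash loop is likely to be faster.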