bash: split a text file in two using a bash script
Source: http://stackoverflow.com/questions/3644238/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverFlow
split text file in two using bash script
Asked by Sergey Kovalev
I have a text file with a marker somewhere in the middle:
one
two
three
blah-blah *MARKER* blah-blah
four
five
six
...
I just need to split this file into two files, the first containing everything before MARKER and the second containing everything after MARKER. It seems this can be done in one line with awk or sed, but I can't figure out how.
I tried the easy way, using csplit, but csplit doesn't play well with Unicode text.
Answered by ghostdog74
You can do it easily with awk:
awk -v RS="MARKER" '{ print > (NR ".txt") }' file
Answered by Leniel Maccaferri
Try this:
awk '/MARKER/{n++}{print > ("out" n ".txt")}' final.txt
It reads input from final.txt and writes out.txt (the lines before the first marker line), then out1.txt, out2.txt, etc., each starting at a marker line.
Answered by Paused until further notice.
sed -n '/MARKER/q;p' inputfile > outputfile1
sed -n '/MARKER/{:a;n;p;ba}' inputfile > outputfile2
Or all in one:
sed -n -e '/MARKER/! w outputfile1' -e '/MARKER/{:a;n;w outputfile2' -e 'ba}' inputfile
Answered by Marcelo Cantos
The split command will almost do what you want:
$ split -p '\*MARKER\*' splitee
$ cat xaa
one
two
three
$ cat xab
blah-blah *MARKER* blah-blah
four
five
six
$ tail -n+2 xab
four
five
six
Perhaps it's close enough for your needs.
I have no idea if it does any better with Unicode than csplit, though.
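For comparison, here is a minimal sketch of the same split done with a plain shell read loop instead of awk, sed, or split. It is not taken from any of the answers. The function name split_at_marker, its argument order, and the output file names are hypothetical, chosen only for illustration. It assumes the marker is a literal substring of exactly one line, and it drops that whole line, which is one reading of "everything before" and "everything after". Because lines are copied through unchanged and the marker is matched as a literal string, it should not be sensitive to multibyte text, but treat that as an assumption to check against your own data.

```shell
#!/usr/bin/env bash
# Hypothetical helper: split_at_marker INPUT BEFORE AFTER [MARKER]
# Lines before the first line containing MARKER go to BEFORE,
# lines after it go to AFTER; the marker line itself is dropped.
split_at_marker() {
    local input=$1 before=$2 after=$3 marker=${4:-MARKER}
    local line seen=0
    : > "$before"   # create or truncate both output files
    : > "$after"
    # IFS= and -r keep leading whitespace and backslashes intact;
    # the "|| [ -n ... ]" part also handles a last line without a newline.
    while IFS= read -r line || [ -n "$line" ]; do
        if [ "$seen" -eq 0 ]; then
            case $line in
                *"$marker"*) seen=1 ;;                      # marker line: switch files, skip it
                *) printf '%s\n' "$line" >> "$before" ;;
            esac
        else
            printf '%s\n' "$line" >> "$after"
        fi
    done < "$input"
}

# Demo on the sample from the question, written to a temporary directory.
tmpdir=$(mktemp -d)
printf '%s\n' one two three 'blah-blah *MARKER* blah-blah' four five six > "$tmpdir/sample.txt"
split_at_marker "$tmpdir/sample.txt" "$tmpdir/before.txt" "$tmpdir/after.txt"
cat "$tmpdir/before.txt"   # one, two, three
cat "$tmpdir/after.txt"    # four, five, six
```

A read loop like this is much slower than awk or sed on large files, so it is mainly useful when you want the logic spelled out or need to adjust what happens to the marker line.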

