merge lines of a txt file using shell script
Original question: http://stackoverflow.com/questions/19061509/
License: this content comes from Stack Overflow and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on Stack Overflow.
Asked by arunmoezhi
I invoke a program from shell script and it creates an output file with this format:
aaaaa\
bbbbb\
ccccc\
I would like to change this to:
aaaaabbbbbccccc
In the vi editor I can just do ggVGJ and then replace all \ with "". But I want to get this done via a script.
Answered by Steve
Here's one way using GNU sed:
sed ':a; N; $!ba; s/\\\n//g; s/\\$//' file
Another way, using awk, may give you better performance:
awk '{ sub(/\\$/, ""); printf "%s", $0 } END { print "" }' file
Results:
aaaaabbbbbccccc
Explanation:
The awk solution removes the trailing backslash (via substitution) and printf's each line (without a newline character). END (which is executed at the end of the script) then prints a newline character. The sed solution works differently: it creates a label called a and appends the next line of input into the pattern space. $!ba means 'if not at the last line of input, branch to label a', so the whole file accumulates in the pattern space before anything is printed. The first substitution then removes each backslash-and-newline pair from the pattern space. The second substitution removes the last, trailing backslash. This solution should be fast for small files, but probably won't be any faster than the awk solution for the same file. Although ... it was faster to write.
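To make the explanation concrete, here is a minimal sketch that runs both approaches on the sample from the question. The temporary file and the variable names (sample, awk_out, sed_out) are illustrative assumptions, and the sed script is split into -e pieces so the label does not rely on GNU-only parsing.

```shell
#!/bin/sh
# Build the three-line sample from the question; each line ends in a backslash.
sample=$(mktemp)
printf '%s\n' 'aaaaa\' 'bbbbb\' 'ccccc\' > "$sample"

# awk: strip one trailing backslash per line and print without a newline,
# then emit a single newline at the end. Only one line is held at a time.
awk_out=$(awk '{ sub(/\\$/, ""); printf "%s", $0 } END { print "" }' "$sample")

# sed: the N loop gathers the whole file into the pattern space first,
# then removes every backslash-newline pair and the final trailing backslash.
sed_out=$(sed -e ':a' -e 'N' -e '$!ba' -e 's/\\\n//g' -e 's/\\$//' "$sample")

printf 'awk: %s\nsed: %s\n' "$awk_out" "$sed_out"
rm -f "$sample"
```

Both variables should hold the same joined string; the difference is only in how much input each tool keeps in memory while working.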
Answered by janos
Here's one way using sed and tr:
sed 's/\\$//' < sample.txt | tr -d '\n'
If you want to add a newline too, you can add an echo at the end:
sed 's/\\$//' < sample.txt | tr -d '\n'; echo
If you want the whole thing to be one unit, for example to use in a ... && ... || ... construct, then you can group the two steps like this:
{ sed 's/\\$//' < sample.txt | tr -d '\n'; echo; }
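As a sketch of why grouping matters in a ... && ... || ... construct, the example below guards the sed and tr steps with a readability check. The file names, the variable names and the 'cannot read input' message are assumptions made up for illustration.

```shell
#!/bin/sh
# Sample input: every line ends in a backslash.
sample=$(mktemp)
printf '%s\n' 'aaaaa\' 'bbbbb\' 'ccccc\' > "$sample"

# The braces turn the pipeline plus echo into one compound command, so the
# && guards the whole group and the || fallback applies to the whole group.
merged=$( [ -r "$sample" ] && { sed 's/\\$//' < "$sample" | tr -d '\n'; echo; } || echo 'cannot read input' )

# Same construct with a path that is assumed not to exist: the group is skipped.
fallback=$( [ -r "$sample.missing" ] && { sed 's/\\$//' < "$sample.missing" | tr -d '\n'; echo; } || echo 'cannot read input' )

printf '%s\n%s\n' "$merged" "$fallback"
rm -f "$sample"
```

Without the braces, the semicolon before echo would end the list, so the && would guard only the pipeline and the || would attach only to echo.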
Answered by Digital Trauma
Another way, using pure bash:
$ cat file.txt
aaaaa\
bbbbb\
ccccc\
$ { cat file.txt ; echo; } | while read line; do echo "$line"; done
aaaaabbbbbccccc
$
This works because the bash read command actually deals with the \ continuation automatically (use the -r switch to read to disable this behavior). The echo after the cat is necessary for this example because the last line of your sample text ends in \, so the read command doesn't think it has got to the end of a line and doesn't output anything. The echo just inserts an empty line at the end of the stream to clean this up.
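The difference between read and read -r can be seen side by side in a small sketch. The temporary file and the variable names (sample, joined, first_raw) are assumptions chosen for illustration.

```shell
#!/bin/sh
# Sample input from the question: every line ends in a backslash.
sample=$(mktemp)
printf '%s\n' 'aaaaa\' 'bbbbb\' 'ccccc\' > "$sample"

# Without -r, read treats backslash-newline as a line continuation,
# so the three physical lines arrive as one logical line.
# The extra echo supplies the newline that ends that logical line.
joined=$( { cat "$sample"; echo; } | while read line; do printf '%s\n' "$line"; done )

# With -r, backslashes are kept literally and each line is read separately.
IFS= read -r first_raw < "$sample"

printf 'read:    %s\n' "$joined"
printf 'read -r: %s\n' "$first_raw"
rm -f "$sample"
```

Here joined should contain the merged text, while first_raw keeps only the first line with its trailing backslash intact.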
Answered by Cyber Oliveira
I guess this solution is the smallest:
$ cat tmp.txt
aaaaa\
bbbbb\
ccccc\
$ cat tmp.txt | tr -d '\\\r\n'
aaaaabbbbbccccc
Answered by Digital Trauma
This is a really ugly hack, but you could use the gcc preprocessor:
$ cat file.txt
aaaaa\
bbbbb\
ccccc\
$ cat file.txt | gcc -xc -E -P -w - | grep .
aaaaabbbbbccccc
$
Why is this risky? If your input text happened to contain preprocessor directives, then they would get interpreted, resulting in a mess.
Answered by Kent
Try this line:
awk -F'\\\\$' '{printf "%s", $1}END{print ""}' file
Answered by iamauser
One with awk and sed:
sed 's/\\$//g' file | awk '{printf "%s", $0}'
The sed command removes the backslash at the end of each line. $ denotes the end of the line, right after the backslash. Since a backslash is a metacharacter in sed, you need an extra \ to escape it. Piping the output of sed to awk's printf prints the multiple lines as one, because printf "%s" emits no newline. $0 represents the entire line.
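The need for the extra backslash can be checked with a short sketch. The test line and variable names below are made up for illustration; the line contains both a literal $ and a trailing backslash so the two patterns give different results.

```shell
#!/bin/sh
# Hypothetical test line containing a dollar sign and ending in a backslash.
line='cost: 5$\'

# \\ matches a literal backslash and the unescaped $ anchors it to end of line.
escaped=$(printf '%s\n' "$line" | sed 's/\\$//')

# With a single backslash, \$ means a literal dollar sign instead,
# so the trailing backslash is left in place.
unescaped=$(printf '%s\n' "$line" | sed 's/\$//')

printf 'with \\\\$: %s\n' "$escaped"
printf 'with \\$:  %s\n' "$unescaped"
```

The first command strips the trailing backslash, while the second removes the dollar sign and leaves the backslash behind.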