bash: replace a string that contains CRLF?

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me).
Original StackOverflow question: http://stackoverflow.com/questions/11393616/

Date: 2020-09-18 02:42:46  Source: igfitidea

Replace string that contains CRLF?

regex, bash, sed

Asked by fredley

I'm reformatting a file, and I want to perform the following steps:

  1. Replace double CRLFs with a temporary character sequence ($CRLF$ or something).
  2. Remove all CRLFs in the whole file.
  3. Go back and turn the temporary sequence into double CRLFs again.

So input like this:

This is a paragraph
of text that has
been manually fitted
into a certain colum
width.

This is another
paragraph of text
that is the same.

will become:

This is a paragraph of text that has been manually fitted into a certain colum width.

This is another paragraph of text that is the same.

It seems this should be possible by piping the input through a few simple sed programs, but I'm not sure how to refer to CRLF in sed (to use in sed 's/<CRLF><CRLF>/$CRLF$/'). Or maybe there's a better way of doing this?

Accepted answer by LSerni

You can use sed to decorate every line with a literal <CRLF> marker at the end:

sed 's/$/<CRLF>/'

then remove all \r and \n characters with tr:

| tr -d "\r\n"

and then replace each doubled <CRLF> marker with \n:

| sed 's/<CRLF><CRLF>/\n/g'

and finally get rid of the leftover <CRLF> markers (replacing each with a space keeps the joined words apart).

There was a sed one-liner that did all this in a single pass, but I can't seem to find it now.
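
Put together, the steps above might look like the following minimal sketch. It assumes GNU sed (for \n in the replacement), and the function name reflow and the sample text are made up for illustration. The last sed expression, which drops a trailing marker and turns the remaining ones into spaces, is one reading of "remove leftover CRLFs", not something the answer spells out.

```shell
# Steps, in pipeline order:
#   1. append a literal <CRLF> marker to every line
#   2. delete every CR and LF character, leaving one long line
#   3. turn doubled markers into a line break, drop a trailing marker,
#      and replace the remaining markers with a space
reflow() {
  sed 's/$/<CRLF>/' |
    tr -d '\r\n' |
    sed 's/<CRLF><CRLF>/\n/g; s/<CRLF>$//; s/<CRLF>/ /g'
}

# Hypothetical CRLF-terminated sample input.
printf 'This is a paragraph\r\nof text.\r\n\r\nThis is another\r\nparagraph.\r\n' | reflow
```

With this input each paragraph ends up on its own line, separated by a single line feed rather than a blank line, because the doubled marker is replaced by one \n as in the answer.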

Answer by me_and

Try the below:

cat file.txt | sed 's/$/ /;s/^[[:space:]]*$/CRLF/' | tr -d '\r\n' | sed 's/CRLF/\r\n/g'

That's not quite the method you described; here is what it does:

  1. Add a space to the end of each line.
  2. Replace any line that contains only whitespace (i.e. blank lines) with "CRLF".
  3. Delete all line-breaking characters (both CR and LF).
  4. Replace every occurrence of the string "CRLF" with a Windows-style line break.

This works on Cygwin bash for me.

Answer by Todd A. Jacobs

Redefine the Problem

It looks like what you're really trying to do is reflow your paragraphs and single-space your lines. There are a number of ways you can do this.

A Non-Sed Solution

If you don't mind using some packages outside coreutils, you could use some additional shell utilities to make this as easy as:

dos2unix /tmp/foo
fmt -w0 /tmp/foo | cat --squeeze-blank | sponge /tmp/foo
unix2dos /tmp/foo

Sponge is from the moreutils package, and will allow you to write to the same file you're reading. The dos2unix (or alternatively the tofrodos) package will allow you to convert your line endings back and forth for easier integration with tools that expect Unix-style line endings.

Answer by potong

This might work for you (GNU sed):

sed ':a;$!{N;/\n$/{p;d};s/\r\?\n/ /;ba}' file
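
To see how the one-liner behaves, here is a small sketch with its commands explained in comments. It assumes GNU sed (the \? operator and ;-separated labels are GNU extensions), and the function name and sample text are made up. The tr -d '\r' in front is an addition of mine: as written, the script only recognises a blank line when that line is completely empty, which a CRLF-terminated blank line is not.

```shell
# What each sed command does:
#   :a            label marking the top of the loop
#   $!{ ... }     run the block on every line except the last
#   N             append the next input line to the pattern space
#   /\n$/{p;d}    the pattern space ends in a newline, so the line just read
#                 was blank: print the finished paragraph, start a new cycle
#   s/\r\?\n/ /   otherwise replace the line break with a space
#   ba            jump back to :a
reflow_paragraphs() {
  tr -d '\r' | sed ':a;$!{N;/\n$/{p;d};s/\r\?\n/ /;ba}'
}

# Hypothetical CRLF-terminated sample input.
printf 'This is a paragraph\r\nof text.\r\n\r\nThis is another\r\nparagraph.\r\n' |
  reflow_paragraphs
```

The result has LF line endings and keeps one blank line between paragraphs.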

Answer by Drew Deal

Am I missing something? Why isn't this easier?

Add CRLF:

sed -e $'s/[[:space:]]*$/\r/' < index.html > index_CRLF.html

Remove CRLF (go back to Unix line endings):

sed -e 's/[[:space:]]*$//' < index_CRLF.html > index.html
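
One way to check what such a conversion actually produced is to dump the bytes. The sketch below is an illustration under a few assumptions rather than the answer's exact commands: it wraps both directions in hypothetical helper functions (to_crlf, to_lf), builds the carriage return with printf so that it also runs in plain POSIX sh, and uses [[:space:]] instead of \s.

```shell
# Literal carriage return; printf keeps this portable beyond bash.
cr=$(printf '\r')

# Replace trailing whitespace (including any existing CR) with a single CR.
# sed keeps each line's LF, so every line ends up terminated by CR LF.
to_crlf() { sed "s/[[:space:]]*\$/$cr/"; }

# Strip trailing whitespace, which includes the CR of a CRLF ending.
to_lf() { sed 's/[[:space:]]*$//'; }

# Dump the bytes in hex to inspect the line endings (0d = CR, 0a = LF).
printf 'a\nb\n' | to_crlf | od -An -tx1
printf 'a\r\nb\r\n' | to_lf | od -An -tx1
```

Because sed works line by line and leaves the LF in place, adding or removing the CR alone is enough to switch between the two conventions.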
