Note: this page reproduces a popular Stack Overflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me) and cite the original: http://stackoverflow.com/questions/2613800/

Date: 2020-08-03 19:56:08  Source: igfitidea

How to convert DOS/Windows newline (CRLF) to Unix newline (LF) in a Bash script?

Tags: linux, windows, bash, unix, newline

Asked by Koran Molovik

How can I programmatically (i.e., not using vi) convert DOS/Windows newlines to Unix?

The dos2unix and unix2dos commands are not available on certain systems. How can I emulate them with commands like sed, awk, or tr?

Answered by codaddict

Using AWK you can do:

awk '{ sub("\r$", ""); print }' dos.txt > unix.txt

Using Perl you can do:

perl -pe 's/\r$//' < dos.txt > unix.txt

Answered by Jonathan Leffler

You can use tr to convert from DOS to Unix; however, you can only do this safely if CR appears in your file only as the first byte of a CRLF byte pair. This is usually the case. You then use:

tr -d '\015' <DOS-file >UNIX-file

Note that the name DOS-file is different from the name UNIX-file; if you try to use the same name twice, you will end up with no data in the file.
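
When the converted text has to end up under the original name, one common workaround is to write to a temporary file first and only then overwrite the original. The following is a minimal sketch, not part of the original answer; crlf_to_lf_inplace is a hypothetical helper name, and it assumes mktemp is available (it is widespread but not required by POSIX).

```shell
# Sketch: strip CRs from a file "in place" by going through a temporary file.
# crlf_to_lf_inplace is a hypothetical helper name, not a standard utility.
crlf_to_lf_inplace() {
    file=$1
    tmp=$(mktemp) || return 1
    # Convert into the temp file; touch the original only if tr succeeded.
    if tr -d '\015' < "$file" > "$tmp"; then
        cat "$tmp" > "$file"   # overwrite the contents, keeping the file's permissions
        rm -f "$tmp"
    else
        rm -f "$tmp"
        return 1
    fi
}
```

It would be called as crlf_to_lf_inplace somefile.txt. The same caveat applies as for plain tr: every CR is removed, not only those that precede an LF.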

You can't do it the other way round (with standard 'tr').

If you know how to enter a carriage return into a script (control-V, control-M to enter control-M), then:

sed 's/^M$//'     # DOS to Unix
sed 's/$/^M/'     # Unix to DOS

where the '^M' is the control-M character. You can also use the bash ANSI-C Quoting mechanism to specify the carriage return:

sed $'s/\r$//'     # DOS to Unix
sed $'s/$/\r/'     # Unix to DOS
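
The $'...' form is a Bash feature and is not available in every sh. As a hedged alternative for plain POSIX shell scripts, the carriage return can be produced with printf and spliced into the sed expression. This is a sketch only; the function names dos_to_unix and unix_to_dos are made up for illustration.

```shell
# Sketch: build the same sed expressions without $'...' quoting.
# Command substitution strips trailing newlines but keeps the carriage return.
cr=$(printf '\r')

# Hypothetical helper names; both read stdin and write stdout.
dos_to_unix() { sed "s/${cr}\$//"; }   # remove a CR at the end of each line
unix_to_dos() { sed "s/\$/${cr}/"; }   # append a CR at the end of each line
```

For example, dos_to_unix < dos.txt > unix.txt. Note that unix_to_dos adds a CR unconditionally, so running it on a file that already has CRLF endings yields CR CR LF.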

However, if you're going to have to do this very often (more than once, roughly speaking), it is far more sensible to install the conversion programs (e.g. dos2unix and unix2dos, or perhaps dtou and utod) and use them.

If you need to process entire directories and subdirectories, you can use zip:

zip -r -ll zipfile.zip somedir/
unzip zipfile.zip

This will create a zip archive with line endings changed from CRLF to LF. unzip will then put the converted files back in place (and ask you file by file - you can answer: Yes-to-all). Credits to @vmsnomad for pointing this out.

Answered by ghostdog74

tr -d "\r" < file

Take a look here for examples using sed:

# IN UNIX ENVIRONMENT: convert DOS newlines (CR/LF) to Unix format.
sed 's/.$//'               # assumes that all lines end with CR/LF
sed 's/^M$//'              # in bash/tcsh, press Ctrl-V then Ctrl-M
sed 's/\x0D$//'            # works on ssed, gsed 3.02.80 or higher

# IN UNIX ENVIRONMENT: convert Unix newlines (LF) to DOS format.
sed "s/$/`echo -e \\r`/"            # command line under ksh
sed 's/$'"/`echo \\r`/"             # command line under bash
sed "s/$/`echo \\r`/"               # command line under zsh
sed 's/$/\r/'                        # gsed 3.02.80 or higher

Use sed -i for in-place conversion, e.g. sed -i 's/..../' file.

Answered by Gordon Davisson

The solutions posted so far only deal with part of the problem, converting DOS/Windows' CRLF into Unix's LF; the part they're missing is that DOS uses CRLF as a line separator, while Unix uses LF as a line terminator. The difference is that a DOS file (usually) won't have anything after the last line in the file, while a Unix file will. To do the conversion properly, you need to add that final LF (unless the file is zero-length, i.e. has no lines in it at all). My favorite incantation for this (with a little added logic to handle Mac-style CR-separated files, and not molest files that are already in Unix format) is a bit of perl:

perl -pe 'if ( s/\r\n?/\n/g ) { $f=1 }; if ( $f || ! $m ) { s/([^\n])\z/$1\n/ }; $m=1' PCfile.txt

Note that this sends the Unixified version of the file to stdout. If you want to replace the file with a Unixified version, add perl's -i flag.
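
To see whether a given file is missing that final LF, or to add it without perl, a small shell check may help. This is a minimal sketch for text files, not the answer's own method; ends_with_newline and add_final_newline are hypothetical helper names.

```shell
# Sketch: report whether a file's last byte is a newline.
ends_with_newline() {
    # Command substitution strips a trailing newline, so the result is empty
    # when the last byte is LF (and also when the file is empty).
    [ -z "$(tail -c 1 "$1")" ]
}

# Append the missing final LF only when the file is non-empty and lacks one.
add_final_newline() {
    if [ -s "$1" ] && ! ends_with_newline "$1"; then
        printf '\n' >> "$1"
    fi
}
```

Calling add_final_newline file.txt after a CR-stripping step leaves zero-length files untouched, in line with the exception noted above.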

Answered by Norman Ramsey

This problem can be solved with standard tools, but there are sufficiently many traps for the unwary that I recommend you install the flip command, which was written over 20 years ago by Rahul Dhesi, the author of zoo. It does an excellent job converting file formats while, for example, avoiding the inadvertent destruction of binary files, which is a little too easy if you just race around altering every CRLF you see...
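
If installing flip is not an option, a crude guard against touching binary files can be sketched with standard tools. The heuristic below (any NUL byte means "binary") is an assumption made for illustration and is not how flip decides; looks_binary is a hypothetical helper name.

```shell
# Sketch: treat a file as binary if it contains any NUL byte (a crude heuristic).
looks_binary() {
    # Deleting NULs changes the byte stream only if the file contains some,
    # so a mismatch reported by cmp means at least one NUL was present.
    ! tr -d '\000' < "$1" | cmp -s - "$1"
}
```

A conversion loop could then skip such files, for example: looks_binary "$f" || tr -d '\015' < "$f" > "$f.unix".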

Answered by mercergeoinfo

I tried sed 's/^M$//' file.txt on OSX as well as several other methods (http://www.thingy-ma-jig.co.uk/blog/25-11-2010/fixing-dos-line-endings or http://hintsforums.macworld.com/archive/index.php/t-125.html). None worked, the file remained unchanged (btw Ctrl-v Enter was needed to reproduce ^M). In the end I used TextWrangler. It's not strictly command line but it works and it doesn't complain.
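
One possible explanation is that sed without -i writes its result to stdout and never modifies file.txt itself. When a conversion seems to have no effect, it can help to count how many lines still end in a carriage return before and after. A minimal sketch, with count_crlf_lines as a hypothetical helper name:

```shell
# Sketch: count the lines in a file that end with a carriage return.
count_crlf_lines() {
    cr=$(printf '\r')
    # grep -c prints the number of matching lines (0 if there are none).
    grep -c "${cr}\$" "$1"
}
```

Running count_crlf_lines file.txt on the input and on the redirected output shows whether the CRs were actually removed.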

Answered by anatoly techtonik

If you don't have access to dos2unix, but can read this page, then you can copy/paste dos2unix.py from here.

#!/usr/bin/env python
"""\
convert dos linefeeds (crlf) to unix (lf)
usage: dos2unix.py <input> <output>
"""
import sys

if len(sys.argv[1:]) != 2:
  sys.exit(__doc__)

content = ''
outsize = 0
with open(sys.argv[1], 'rb') as infile:
  content = infile.read()
with open(sys.argv[2], 'wb') as output:
  for line in content.splitlines():
    outsize += len(line) + 1
    output.write(line + b'\n')  # bytes literal: both files are opened in binary mode

print("Done. Saved %s bytes." % (len(content)-outsize))

Cross-posted from superuser.

Answered by Steven Penny

Doing this with POSIX is tricky:

  • POSIX Sed does not support \r or \15. Even if it did, the in-place option -i is not POSIX

  • POSIX Awk does support \r and \15, however the -i inplace option is not POSIX

  • d2u and dos2unix are not POSIX utilities, but ex is

  • POSIX ex does not support \r, \15, \n or \12

To remove carriage returns:

ex -bsc '%!awk "{sub(/\r/,\"\")}1"' -cx file

To add carriage returns:

ex -bsc '%!awk "{sub(/$/,\"\r\")}1"' -cx file

Answered by Ashley Raiteri

On Mac OS X, if you have Homebrew installed (http://brew.sh/):

brew install dos2unix

for csv in *.csv; do dos2unix -c mac "${csv}"; done

Make sure you have made copies of the files, as this command modifies them in place. Note that, per the dos2unix manual, -c mac selects Mac mode, which converts old Mac-style CR line breaks to LF; to convert DOS CRLF line breaks, omit the option.

Answered by nawK

An even simpler awk solution w/o a program:

awk -v ORS='\r\n' '1' unix.txt > dos.txt

Technically, '1' is your program, b/c awk requires one when given an option.
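
Setting ORS appends CR LF to every record, so a line that already ends in CR comes out as CR CR LF. If the input may be mixed, a slightly longer variant can strip an existing CR first. This is a sketch under that assumption, with to_crlf as a hypothetical helper name:

```shell
# Sketch: convert to CRLF without doubling the CR on lines that already have one.
to_crlf() {
    # Remove a trailing CR if present, then print the line with an explicit CR LF.
    awk '{ sub(/\r$/, ""); printf "%s\r\n", $0 }' "$@"
}
```

Usage mirrors the one-liner above: to_crlf unix.txt > dos.txt.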

UPDATE: After revisiting this page for the first time in a long time I realized that no one has yet posted an internal (shell builtins only) solution, so here is one:

while IFS= read -r line;
do printf '%s\n' "${line%$'\r'}";
done < dos.txt > unix.txt
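
One caveat with this loop: read returns a non-zero status on a final line that has no trailing newline, so that line is silently dropped. A hedged variant that keeps it is sketched below; strip_cr is a hypothetical function name, and the CR is produced with printf instead of $'\r' so that the sketch also runs in a plain POSIX sh.

```shell
# Sketch: same idea as the loop above, but the last line is kept even when
# the input does not end with a newline.
strip_cr() {
    cr=$(printf '\r')
    # "|| [ -n "$line" ]" runs the body once more for an unterminated last line.
    while IFS= read -r line || [ -n "$line" ]; do
        printf '%s\n' "${line%$cr}"
    done
}
```

For example, strip_cr < dos.txt > unix.txt. As a side effect the output always ends with an LF, which is what a Unix text file is expected to have.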