How to convert DOS/Windows newline (CRLF) to Unix newline (LF) in a Bash script?
Original question: http://stackoverflow.com/questions/2613800/
Note: this page reproduces a popular Stack Overflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL, and attribute the content to the original authors on Stack Overflow.
Asked by Koran Molovik
How can I programmatically (i.e., not using vi) convert DOS/Windows newlines to Unix?
The dos2unix and unix2dos commands are not available on certain systems. How can I emulate these with commands like sed/awk/tr?
Answer by codaddict
Using AWK you can do:
awk '{ sub("\r$", ""); print }' dos.txt > unix.txt
Using Perl you can do:
perl -pe 's/\r$//' < dos.txt > unix.txt
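Before converting, it can help to confirm that a file really contains CRLF line endings. The following is a minimal sketch, assuming a POSIX shell with grep and awk; the function names has_crlf and crlf_to_lf are illustrative and do not come from the answer:

```shell
# Hypothetical helpers (names are illustrative, not from the answer above).
cr=$(printf '\r')   # a literal carriage return, obtained portably

# Succeeds if at least one line of the file ends in CR (CRLF on disk).
has_crlf() {
    grep -q "${cr}\$" "$1"
}

# The same awk program as above, wrapped as a stdin-to-stdout filter.
crlf_to_lf() {
    awk '{ sub(/\r$/, ""); print }'
}
```

It could be used as: has_crlf dos.txt && crlf_to_lf < dos.txt > unix.txt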
Answer by Jonathan Leffler
You can use tr to convert from DOS to Unix; however, you can only do this safely if CR appears in your file only as the first byte of a CRLF byte pair. This is usually the case. You then use:
tr -d '\015' <DOS-file >UNIX-file
Note that the name DOS-file is different from the name UNIX-file; if you try to use the same name twice, you will end up with no data in the file, because the shell truncates the output file before tr gets to read it.
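When the result has to end up under the original name, one common workaround is to go through a temporary file. A minimal sketch, assuming mktemp is available (it is widespread but not specified by POSIX); the function name dos2unix_inplace is hypothetical:

```shell
# Hypothetical helper: convert a file "in place" via a temporary file.
dos2unix_inplace() {
    tmp=$(mktemp) || return 1
    # Write the converted data to the temp file first, then copy it back,
    # so the input is never truncated while it is still being read.
    # Copying with cat (rather than mv) keeps the original permissions.
    if tr -d '\015' < "$1" > "$tmp" && cat "$tmp" > "$1"; then
        rm -f "$tmp"
    else
        rm -f "$tmp"
        return 1
    fi
}
```

Usage would be: dos2unix_inplace somefile.txt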
You can't do it the other way round (with standard 'tr').
If you know how to enter a carriage return into a script (control-V, control-M to enter control-M), then:
sed 's/^M$//' # DOS to Unix
sed 's/$/^M/' # Unix to DOS
where the '^M' is the control-M character. You can also use the bash ANSI-C Quoting mechanism to specify the carriage return:
sed $'s/\r$//' # DOS to Unix
sed $'s/$/\r/' # Unix to DOS
However, if you're going to have to do this very often (more than once, roughly speaking), it is far more sensible to install the conversion programs (e.g. dos2unix and unix2dos, or perhaps dtou and utod) and use them.
If you need to process entire directories and subdirectories, you can use zip:
zip -r -ll zipfile.zip somedir/
unzip zipfile.zip
This will create a zip archive with line endings changed from CRLF to LF. unzip will then put the converted files back in place (and ask you file by file; you can answer: Yes-to-all). Credits to @vmsnomad for pointing this out.
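If zip is not installed, a comparable recursive conversion can be sketched with find. This is a different technique from the zip approach, not a reproduction of it. It assumes that only files named *.txt should be touched and that mktemp exists; the function name convert_tree is hypothetical:

```shell
# Hypothetical helper: strip CR from every *.txt file below a directory.
convert_tree() {
    find "$1" -type f -name '*.txt' -exec sh -c '
        for f do
            tmp=$(mktemp) || exit 1
            # Convert into a temp file, then copy back over the original.
            tr -d "\015" < "$f" > "$tmp" && cat "$tmp" > "$f"
            rm -f "$tmp"
        done
    ' sh {} +
}
```

Usage would be: convert_tree somedir/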
Answer by ghostdog74
tr -d "\r" < file
Some examples using sed:
# IN UNIX ENVIRONMENT: convert DOS newlines (CR/LF) to Unix format.
sed 's/.$//' # assumes that all lines end with CR/LF
sed 's/^M$//' # in bash/tcsh, press Ctrl-V then Ctrl-M
sed 's/\x0D$//' # works on ssed, gsed 3.02.80 or higher
# IN UNIX ENVIRONMENT: convert Unix newlines (LF) to DOS format.
sed "s/$/`echo -e \\r`/" # command line under ksh
sed 's/$'"/`echo \\r`/" # command line under bash
sed "s/$/`echo \\r`/" # command line under zsh
sed 's/$/\r/' # gsed 3.02.80 or higher
Use sed -i for in-place conversion, e.g. sed -i 's/..../' file.
Answer by Gordon Davisson
The solutions posted so far only deal with part of the problem, converting DOS/Windows' CRLF into Unix's LF; the part they're missing is that DOS uses CRLF as a line separator, while Unix uses LF as a line terminator. The difference is that a DOS file (usually) won't have anything after the last line in the file, while Unix will. To do the conversion properly, you need to add that final LF (unless the file is zero-length, i.e. has no lines in it at all). My favorite incantation for this (with a little added logic to handle Mac-style CR-separated files, and not molest files that are already in Unix format) is a bit of perl:
perl -pe 'if ( s/\r\n?/\n/g ) { $f=1 }; if ( $f || ! $m ) { s/([^\n])\z/$1\n/ }; $m=1' PCfile.txt
Note that this sends the Unixified version of the file to stdout. If you want to replace the file with a Unixified version, add perl's -i flag.
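The "add the final LF" step can also be sketched on its own in plain shell. This is only an illustration of that one step, not a replacement for the perl one-liner; the function name ensure_final_newline is hypothetical:

```shell
# Hypothetical helper: append a final LF if the file is non-empty and its
# last byte is not already a newline.
ensure_final_newline() {
    # Command substitution strips a trailing newline, so the result is
    # empty exactly when the last byte is LF.
    if [ -s "$1" ] && [ -n "$(tail -c 1 "$1")" ]; then
        printf '\n' >> "$1"
    fi
}
```

Zero-length files are left untouched, matching the exception described in the answer.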
Answer by Norman Ramsey
This problem can be solved with standard tools, but there are sufficiently many traps for the unwary that I recommend you install the flip command, which was written over 20 years ago by Rahul Dhesi, the author of zoo. It does an excellent job converting file formats while, for example, avoiding the inadvertent destruction of binary files, which is a little too easy if you just race around altering every CRLF you see...
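If flip is not available, one rough way to guard against damaging binary files is to convert only files that grep regards as text. This is a sketch under assumptions, not how flip itself works: it relies on the -I option (treat binary files as non-matching), which GNU and BSD grep provide but POSIX does not specify, and the function names are hypothetical:

```shell
# Hypothetical helper: succeeds if grep considers the file to be text.
# Assumes grep supports -I (GNU and BSD grep do; POSIX does not require it).
is_text_file() {
    grep -Iq . "$1"
}

# Convert only files that look like text; report the ones that are skipped.
convert_if_text() {
    if is_text_file "$1"; then
        tmp=$(mktemp) || return 1
        tr -d '\015' < "$1" > "$tmp" && cat "$tmp" > "$1"
        rm -f "$tmp"
    else
        printf 'skipped: %s\n' "$1" >&2
    fi
}
```

The heuristic is only as good as grep's binary detection, so keeping backups is still advisable.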
Answer by mercergeoinfo
I tried sed 's/^M$//' file.txt on OS X as well as several other methods (http://www.thingy-ma-jig.co.uk/blog/25-11-2010/fixing-dos-line-endings or http://hintsforums.macworld.com/archive/index.php/t-125.html). None worked; the file remained unchanged (by the way, Ctrl-V Enter was needed to reproduce ^M). In the end I used TextWrangler. It's not strictly command line, but it works and it doesn't complain.
Answer by anatoly techtonik
If you don't have access to dos2unix, but can read this page, then you can copy/paste dos2unix.py from here.
#!/usr/bin/env python
"""\
convert dos linefeeds (crlf) to unix (lf)
usage: dos2unix.py <input> <output>
"""
import sys

if len(sys.argv[1:]) != 2:
    sys.exit(__doc__)

content = b''
outsize = 0
# Read and write in binary mode so that no newline translation is applied.
with open(sys.argv[1], 'rb') as infile:
    content = infile.read()
with open(sys.argv[2], 'wb') as output:
    # splitlines() on bytes splits on \n, \r\n and \r.
    for line in content.splitlines():
        outsize += len(line) + 1
        # Bytes literal, so the script also runs under Python 3.
        output.write(line + b'\n')

print("Done. Saved %s bytes." % (len(content) - outsize))
Cross-posted from superuser.
Answer by Steven Penny
Doing this with POSIX is tricky:
POSIX Sed does not support \r or \15. Even if it did, the in-place option -i is not POSIX.
POSIX Awk does support \r and \15, however the -i inplace option is not POSIX.
d2u and dos2unix are not POSIX utilities, but ex is.
POSIX ex does not support \r, \15, \n or \12.
To remove carriage returns:
ex -bsc '%!awk "{sub(/\r/,\"\")}1"' -cx file
To add carriage returns:
ex -bsc '%!awk "{sub(/$/,\"\r\")}1"' -cx file
Answer by Ashley Raiteri
For Mac OS X, if you have Homebrew (http://brew.sh/) installed:
brew install dos2unix
for csv in *.csv; do dos2unix -c mac "$csv"; done
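Make sure you have made copies of the files, as this command will modify the files in place. The -c mac option selects Mac conversion mode, which converts classic Mac (CR-only) line breaks to LF; according to the dos2unix manual, DOS (CRLF) line breaks are not changed in this mode, so omit -c mac when the files have DOS line endings.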
Answer by nawK
An even simpler awk solution without a program:
awk -v ORS='\r\n' '1' unix.txt > dos.txt
Technically '1' is your program, because awk requires one when given an option.
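With ORS alone, a line that already ends in CR would come out with two. A minimal sketch of a variant that is safe to run on files that are partly or fully in DOS format already; the function name lf_to_crlf is hypothetical:

```shell
# Hypothetical helper: LF -> CRLF without doubling an existing CR.
lf_to_crlf() {
    # Drop a trailing CR if present, then emit the line with exactly one CRLF.
    awk '{ sub(/\r$/, ""); printf "%s\r\n", $0 }'
}
```

Usage would be: lf_to_crlf < unix.txt > dos.txt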
UPDATE: After revisiting this page for the first time in a long time, I realized that no one has yet posted an internal solution (one that uses only shell built-ins), so here is one:
while IFS= read -r line; do
    printf '%s\n' "${line%$'\r'}"
done < dos.txt > unix.txt
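One caveat with this loop: read returns a non-zero status for a final line that has no terminating newline, so that line is silently dropped. A minimal sketch of a variant that keeps it, written as a filter; the function name crlf_to_lf_builtin is hypothetical, and the carriage return is obtained with printf so the code does not depend on bash's $'\r' quoting:

```shell
cr=$(printf '\r')   # a literal carriage return

# Hypothetical helper: like the loop above, but also emits a last line
# that is not terminated by a newline.
crlf_to_lf_builtin() {
    # read fails on an unterminated last line but still fills $line,
    # so the extra test lets the loop body run one more time.
    while IFS= read -r line || [ -n "$line" ]; do
        printf '%s\n' "${line%"$cr"}"
    done
}
```

Usage would be: crlf_to_lf_builtin < dos.txt > unix.txt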