How to convert Windows line endings to Unix line endings (CR/LF to LF)
Note: this page is adapted from a popular Stack Overflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Original: http://stackoverflow.com/questions/3891076/
Asked by MaikoID
I'm a Java developer and I'm using Ubuntu to develop. The project was created in Windows with Eclipse and it's using the CP1252 encoding.
To convert to UTF-8 I've used the recode program:
find Web -iname \*.java | xargs recode CP1252...UTF-8
This command gives the following error:
recode: Web/src/br/cits/projeto/geral/presentation/GravacaoMessageHelper.java failed: Ambiguous output in step `CR-LF..data'
I searched for it and found a solution here: http://fvue.nl/wiki/Bash_and_Windows#Recode:_Ambiguous_output_in_step_.60data..CR-LF.27 and it says:
Convert line endings from CR/LF to a single LF: Edit the file with vim, give the command :set ff=unix and save the file. Recode should now run without errors.
Nice, but I have many files to remove the CR/LF characters from, and I can't open each one to do it. Vi doesn't provide any command-line option for bash operations.
Can sed be used to do this? How?
Thanks =)
Answered by cHao
There should be a program called dos2unix that will fix line endings for you. If it's not already on your Linux box, it should be available via the package manager.
Answered by KeithL
The tr command can also do this:
tr -d '\15\32' < winfile.txt > unixfile.txt
and should be available to you.
You'll need to run tr from within a script, since it cannot work with file names. For example, create a file myscript.sh:
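#!/bin/bash
# Usage: myscript.sh DIR
cd "$1" || exit 1
find . -iname '*.java' -print0 | while IFS= read -r -d '' f; do
    echo "$f"
    # Delete CR (octal 15) and Ctrl-Z (octal 32) characters.
    tr -d '\15\32' < "$f" > "$f.tr"
    mv "$f.tr" "$f"
    recode CP1252...UTF-8 "$f"
done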
Running myscript.sh Web would process all the Java files in the folder Web.
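For comparison, the two steps the script performs on each file (delete CR and Ctrl-Z bytes, then re-encode from CP1252 to UTF-8) can be sketched in Python. This is a minimal sketch and not part of the original answer: the function names strip_dos_bytes, cp1252_to_utf8 and convert_file are made up for illustration, and it assumes each file fits in memory. Note that Python's cp1252 codec raises UnicodeDecodeError on the few byte values that CP1252 leaves undefined.

```python
def strip_dos_bytes(data: bytes) -> bytes:
    r"""Delete every CR (octal 15) and Ctrl-Z (octal 32) byte, like tr -d '\15\32'."""
    return data.translate(None, b"\r\x1a")


def cp1252_to_utf8(data: bytes) -> bytes:
    """Re-encode CP1252 bytes as UTF-8 (the job recode does in the script)."""
    return data.decode("cp1252").encode("utf-8")


def convert_file(path: str) -> None:
    """Rewrite one file in place: strip DOS bytes, then convert the encoding."""
    with open(path, "rb") as handle:
        data = handle.read()
    with open(path, "wb") as handle:
        handle.write(cp1252_to_utf8(strip_dos_bytes(data)))
```

Like tr -d, strip_dos_bytes removes every CR byte, not only those that sit directly before an LF.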
Answered by Jonathan
Go back to Windows, tell Eclipse to change the encoding to UTF-8, then go back to Unix and run d2u on the files.
Answered by Anthony O.
Did you try the Python script by Bryan Maupin found here? (I've modified it a little bit to be more generic.)
#!/usr/bin/env python3
import sys

input_file_name = sys.argv[1]
output_file_name = sys.argv[2]

# Open both files in binary mode so the bytes are decoded explicitly below.
with open(input_file_name, 'rb') as input_file, \
        open(output_file_name, 'wb') as output_file:
    for line_number, input_line in enumerate(input_file, start=1):
        try:  # first try to decode it using cp1252 (Windows, Western Europe)
            output_line = input_line.decode('cp1252').encode('utf8')
        except UnicodeDecodeError as error:  # if there's an error
            sys.stderr.write('ERROR (line %s):\t%s\n' % (line_number, error))  # write to stderr
            try:  # then if that fails, try to decode using latin1 (ISO 8859-1)
                output_line = input_line.decode('latin1').encode('utf8')
            except UnicodeDecodeError as error:  # if there's an error
                sys.stderr.write('ERROR (line %s):\t%s\n' % (line_number, error))  # write to stderr
                sys.exit(1)  # and give up
        output_file.write(output_line)
You can use that script with:
$ ./cp1252_utf8.py file_cp1252.sql file_utf8.sql
Answered by V_V
In order to overcome
Ambiguous output in step `CR-LF..data'
a simple solution might be to add the -f flag to force the conversion.
Answered by Jichao
sed cannot match \n because the trailing newline is removed before the line is put into the pattern space, but it can match \r, so you can convert \r\n (DOS) to \n (Unix) by removing \r:
sed -i 's/\r//g' file
Warning: this will change the original file.
However, you cannot change from Unix EOL to DOS or old Mac (\r) this way. More reading here:
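The line-by-line model described above can be illustrated with a short Python sketch. It is only an illustration of the idea, not how sed is implemented; the function name remove_carriage_returns is hypothetical. Each line is edited without its trailing newline, so the edit can see \r but never \n, and the newlines are put back afterwards.

```python
def remove_carriage_returns(text: str) -> str:
    r"""Mimic sed 's/\r//g': edit every line without its trailing newline."""
    edited_lines = []
    for line in text.split("\n"):
        # At this point the newline is already gone, as in sed's pattern space.
        edited_lines.append(line.replace("\r", ""))
    # Re-attach the newlines that were set aside while editing.
    return "\n".join(edited_lines)
```

Because of the g flag, a \r in the middle of a line is removed as well, not only the one at the end.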
Answered by Arandur
Actually, vim does allow what you're looking for. Enter vim, and type the following commands:
:args **/*.java
:argdo set ff=unix | update | next
The first of these commands sets the argument list to every file matching **/*.java, which is all Java files, recursively. The second of these commands does the following to each file in the argument list, in turn:
- Sets the line-endings to Unix style (you already know this)
- Writes the file out iff it's been changed
- Proceeds to the next file
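A rough Python counterpart of the same walk (find every matching file recursively, convert it, and write it back only if it changed) might look like the sketch below. The function name convert_tree and its parameters are assumptions made for illustration. It replaces CR/LF pairs at the byte level, which is not exactly what vim does when it detects and rewrites a file format.

```python
from pathlib import Path


def convert_tree(root: str, pattern: str = "*.java") -> list:
    """Convert CR/LF to LF in every matching file under root; return changed paths."""
    changed = []
    for path in sorted(Path(root).rglob(pattern)):
        if not path.is_file():
            continue
        original = path.read_bytes()
        converted = original.replace(b"\r\n", b"\n")
        if converted != original:  # like :update, write only when something changed
            path.write_bytes(converted)
            changed.append(str(path))
    return changed
```

Calling convert_tree("Web") would process the Java files under the folder Web and return the list of files it rewrote.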
Answered by John Chesshir
I'll take a little exception to Jichao's answer. You can actually do everything he just talked about fairly easily. Instead of looking for a \n, just look for a carriage return at the end of the line.
sed -i 's/\r$//' ${FILE_NAME}
To change from Unix back to DOS, simply look for the last character on the line and add a carriage return after it. (I'll add -r to make this easier with extended regular expressions.)
sed -ri 's/(.)$/\1\r/' ${FILE_NAME}
Theoretically, the file could be changed to mac style by adding code to the last example that also appends the next line of input to the first line until all lines have been processed. I won't try to make that example here, though.
Warning: -i changes the actual file. If you want a backup to be made, add a string of characters after -i. This will move the existing file to a file with the same name with your characters added to the end.
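The three conversions discussed in this answer (to Unix, to DOS, and to old Mac style) can also be sketched at the byte level in Python. This is a minimal sketch with hypothetical function names, not a translation of the sed commands. Unlike a pattern that needs a last character on the line, it also converts empty lines.

```python
def to_unix(data: bytes) -> bytes:
    """Turn CR/LF pairs into a single LF."""
    return data.replace(b"\r\n", b"\n")


def to_dos(data: bytes) -> bytes:
    """Turn every line ending into CR/LF."""
    # Normalize first so existing CR/LF pairs do not become CR CR LF.
    return to_unix(data).replace(b"\n", b"\r\n")


def to_old_mac(data: bytes) -> bytes:
    """Turn every line ending into a lone CR."""
    return to_unix(data).replace(b"\n", b"\r")
```

Normalizing to LF first makes to_dos and to_old_mac safe to run on files that already contain some CR/LF line endings.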