How do you search for files containing dos line endings (CRLF) with grep on Linux?
Original question: http://stackoverflow.com/questions/73833/
Content from Stack Overflow, licensed under CC BY-SA 4.0; reuse must attribute the original authors.
Asked by Tim Abell
I want to search for files containing dos line endings with grep on Linux. Something like this:
grep -IUr --color '\r\n' .
The above seems to match the literal characters rn, which is not what is desired.
The output of this will be piped through xargs into fromdos to convert CRLF to LF, like this:
grep -IUrl --color '^M' . | xargs -ifile fromdos 'file'
Accepted answer by pjz
Use Ctrl+V, Ctrl+M to enter a literal carriage return character into your grep string. So:
grep -IUr --color "^M"
will work, provided the ^M there is a literal CR that you typed as I suggested.
If you want the list of files, add the -l option as well.
Explanation
-I: ignore binary files.
-U: prevent grep from stripping CR characters. By default it would strip them if it decides the file is a text file.
-r: read all files under each directory recursively.
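For use in a script, where typing Ctrl+V, Ctrl+M is awkward, the same search can be sketched with a CR produced by printf. This is only an illustration under assumptions: it relies on GNU grep, xargs and sed, the helper name crlf_files is made up, and sed is swapped in for fromdos so the example does not depend on that package.

```shell
#!/bin/sh
# Assumes GNU grep, xargs and sed. crlf_files is a made-up helper name.
cr=$(printf '\r')   # a literal CR without typing Ctrl+V, Ctrl+M

# Print the NUL-terminated names of text files under directory $1 that
# contain at least one line ending in CR LF.
crlf_files() {
    # -Z ends each name with NUL so names with spaces survive the pipe.
    grep -rlIUZ -- "${cr}\$" "$1"
}

demo_dir=$(mktemp -d)
printf 'one\r\ntwo\r\n' > "$demo_dir/dos file.txt"
printf 'one\ntwo\n' > "$demo_dir/unix.txt"

# List the matches, one per line.
crlf_files "$demo_dir" | tr '\0' '\n'

# Strip the trailing CR in place; sed stands in for fromdos here.
crlf_files "$demo_dir" | xargs -0 -r sed -i 's/\r$//'

rm -rf "$demo_dir"
```

Because -l makes grep print each file name once, the converter runs once per file rather than once per matching line.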
Answer by Thomee
grep probably isn't the tool you want for this. It will print a line for every matching line in every file. Unless you want to, say, run todos 10 times on a 10 line file, grep isn't the best way to go about it. Using find to run file on every file in the tree then grepping through that for "CRLF" will get you one line of output for each file which has dos style line endings:
find . -not -type d -exec file "{}" ";" | grep CRLF
will get you something like:
./1/dos1.txt: ASCII text, with CRLF line terminators
./2/dos2.txt: ASCII text, with CRLF line terminators
./dos.txt: ASCII text, with CRLF line terminators
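If only the file names are needed, for example to feed a converter, the description can be cut off after the first colon. A minimal sketch, assuming file(1) is installed and that no file name contains a colon or the text CRLF; the helper name crlf_by_file is hypothetical.

```shell
#!/bin/sh
# Assumes file(1) is available. crlf_by_file is a made-up helper name.

# Print the names of regular files under directory $1 that file(1)
# describes as having CRLF line terminators.
crlf_by_file() {
    # "+" passes many names to each file invocation instead of starting
    # one process per file. cut keeps the text before the first colon,
    # so a name that itself contains ":" would be truncated.
    find "$1" -type f -exec file {} + | grep 'CRLF' | cut -d: -f1
}

demo_dir=$(mktemp -d)
printf 'hello world\r\nsecond line\r\n' > "$demo_dir/dos.txt"
printf 'hello world\nsecond line\n' > "$demo_dir/unix.txt"
crlf_by_file "$demo_dir"
rm -rf "$demo_dir"
```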
Answer by Linulin
If your version of grep supports the -P (--perl-regexp) option, then
grep -lUP '\r$'
could be used.
Answer by yabt
# list files containing dos line endings (CRLF)
cr="$(printf "\r")" # alternative to ctrl-V ctrl-M
grep -Ilsr "${cr}$" .
grep -Ilsr $'\r$' . # yet another & even shorter alternative
Answer by Peter Y
The question was about searching, and I have a similar issue: somebody committed mixed line endings into version control, so now we have a bunch of files with 0x0d 0x0d 0x0a line endings. Note that
grep -P '\x0d\x0a'
finds all lines, whereas
grep -P '\x0d\x0d\x0a'
and
grep -P '\x0d\x0d'
find no lines, so there may be something "else" going on inside grep when it comes to line ending patterns... unfortunately for me!
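One thing that may be worth trying for the doubled-CR case: grep works on one line at a time with the terminating LF removed, so anchoring the pattern with $ instead of spelling out the LF is a possible alternative. This is a sketch only; it has not been checked against the environment described above, and the helper name double_cr_files is made up.

```shell
#!/bin/sh
# double_cr_files is a made-up helper name.
cr=$(printf '\r')

# Print the names of text files under directory $1 that contain at
# least one line ending in CR CR LF.
double_cr_files() {
    # Two literal CRs followed by the end-of-line anchor.
    grep -rlIU -- "${cr}${cr}\$" "$1"
}

demo_dir=$(mktemp -d)
printf 'one\r\r\ntwo\r\r\n' > "$demo_dir/doubled.txt"
printf 'one\r\ntwo\r\n' > "$demo_dir/dos.txt"
double_cr_files "$demo_dir"
rm -rf "$demo_dir"
```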
Answer by MykennaC
If, like me, your minimalist unix doesn't include niceties like the file command, and backslashes in your grep expressions just don't cooperate, try this:
$ for file in `find . -type f` ; do
> dump $file | cut -c9-50 | egrep -m1 -q ' 0d| 0d'
> if [ $? -eq 0 ] ; then echo $file ; fi
> done
Modifications you may want to make to the above include:
- tweak the find command to locate only the files you want to scan
- change the dump command to od or whatever file dump utility you have
- confirm that the cut command keeps a leading and trailing space around the hexadecimal output and drops everything else the dump utility prints
- limit the dump output to the first 1000 characters or so for efficiency
For example, something like this may work for you using od instead of dump:
od -t x2 -N 1000 $file | cut -c8- | egrep -m1 -q ' 0d| 0d|0d$'
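A variant of the same idea that dumps one byte per hex word may be easier to match, since it avoids byte pairs being grouped (and possibly swapped) into two-byte words. This is a sketch assuming an od that accepts -An, -t x1 and -N; the helper name has_cr is hypothetical. Like the loop above, it detects any CR in the first 1000 bytes, not specifically CR followed by LF.

```shell
#!/bin/sh
# has_cr is a made-up helper name.

# Succeed if the first 1000 bytes of file $1 contain a CR (0x0d).
has_cr() {
    # -An drops the offset column, -t x1 prints each byte as a two-digit
    # hex word preceded by a space, -N 1000 limits how much is read.
    od -An -t x1 -N 1000 "$1" | grep -q ' 0d'
}

demo_dir=$(mktemp -d)
printf 'abc\r\ndef\r\n' > "$demo_dir/dos.txt"
printf 'abc\ndef\n' > "$demo_dir/unix.txt"

# Names containing newlines would break this loop.
find "$demo_dir" -type f | while IFS= read -r f; do
    if has_cr "$f"; then printf '%s\n' "$f"; fi
done

rm -rf "$demo_dir"
```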
Answer by Steven Penny
Answer by Murali Krishna Parimi
You can use the file command on Unix. It reports the character encoding of the file along with its line terminators.
$ file myfile
myfile: ISO-8859 text, with CRLF line terminators
$ file myfile | grep -ow CRLF
CRLF
Answer by dessert
dos2unix has a file information option which can be used to show the files that would be converted:
dos2unix -ic /path/to/file
To do that recursively you can use bash's globstar option, which for the current shell is enabled with shopt -s globstar:
dos2unix -ic ** # all files recursively
dos2unix -ic **/file # files called “file” recursively
Alternatively you can use find for that:
find -exec dos2unix -ic {} + # all files recursively
find -name file -exec dos2unix -ic {} + # files called “file” recursively
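Where dos2unix is not installed, a rough per-file summary can be approximated with grep alone. The sketch below is an assumption-laden stand-in, not a reimplementation of dos2unix: the helper name line_ending_counts is made up, and it only distinguishes lines ending in CR LF from all other lines.

```shell
#!/bin/sh
# line_ending_counts is a made-up helper name.
cr=$(printf '\r')

# Print "<CRLF lines> <other lines> <name>", tab-separated, for file $1.
# The body runs in a subshell so its variables do not leak.
line_ending_counts() (
    total=$(grep -c '' -- "$1")          # every line matches the empty pattern
    crlf=$(grep -c -- "${cr}\$" "$1")    # lines whose last character is CR
    printf '%s\t%s\t%s\n' "$crlf" "$((total - crlf))" "$1"
)

demo_dir=$(mktemp -d)
printf 'a\r\nb\nc\r\n' > "$demo_dir/mixed.txt"
line_ending_counts "$demo_dir/mixed.txt"   # prints 2, 1 and the file name
rm -rf "$demo_dir"
```

A file with a non-zero first column and a non-zero second column has mixed line endings.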