How to find DOS-format files in a Linux file system

Notice: this page is a Chinese-English side-by-side translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/4719750/

Time: 2020-08-05 02:31:50  Source: igfitidea

Tags: linux, shell, vim, find

Asked by denormalizer

I would like to find out which of my files in a directory are dos text files (as opposed to unix text files).

What I've tried:

find . -name "*.php" | xargs grep ^M -l

It's not giving me reliable results... so I'm looking for a better alternative.

Any suggestions, ideas?

Thanks

Clarification

In addition to what I've said above, the problem is that I have a bunch of dos files with no ^M characters in them (hence my note about reliability).

The way I currently determine whether a file is dos or not is through Vim, where at the bottom it says:

"filename.php" [dos] [noeol]

Accepted answer by paxdiablo

Not sure what you mean exactly by "not reliable" but you may want to try:

find . -name '*.php' -print0 | xargs -0 grep -l '^M$'

This uses the options that are friendlier to atrocious filenames with spaces in them (-print0 and -0), and it only finds carriage returns immediately before the end of line.

Keep in mind that the ^M is a single Ctrl-M character, not two characters (in bash you can usually type it as Ctrl-V followed by Ctrl-M).

Also note that it'll list files where even one line is in DOS mode, which is probably what you want anyway since those would have been UNIX files mangled by a non-UNIX editor.
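
The same "at least one CR LF line" check can be sketched in Python, which avoids having to type a literal ^M. This is only an illustration, not part of the original answer: the names has_crlf and find_crlf_files, and the default ".php" suffix, are made up for this example.

```python
import os


def has_crlf(data):
    """Return True if at least one line in the bytes ends with CR LF."""
    return b"\r\n" in data


def find_crlf_files(root, suffix=".php"):
    """Walk root and return the sorted paths of matching files that contain CR LF."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(suffix):
                continue
            path = os.path.join(dirpath, name)
            # Binary mode, so Python performs no newline translation of its own.
            with open(path, "rb") as fh:
                if has_crlf(fh.read()):
                    matches.append(path)
    return sorted(matches)
```

Calling find_crlf_files(".") would list every .php file under the current directory in which at least one line ends with CR LF, like the grep command above.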

Based on your update that vim is reporting your files as DOS format:

If vim is reporting it as DOS format, then every line ends with CRLF. That's the way vim works. If even one line doesn't have CR, then it's considered UNIX format and the ^M characters are visible in the buffer. If it's all DOS format, the ^M characters are not displayed:

Vim will look for both dos and unix line endings, but Vim has a built-in preference for the unix format.

- If all lines in the file end with CRLF, the dos file format will be applied, meaning that each CRLF is removed when reading the lines into a buffer, and the buffer 'ff' option will be dos.
- If one or more lines end with LF only, the unix file format will be applied, meaning that each LF is removed (but each CR will be present in the buffer, and will display as ^M), and the buffer 'ff' option will be unix.
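
The rule quoted above can be approximated in a few lines of Python. This is a rough sketch, not Vim's actual implementation: guess_fileformat is a made-up name, and it assumes that a final line without a line break (Vim's [noeol]) does not influence the decision and that a file with no line breaks at all counts as unix.

```python
def guess_fileformat(data):
    """Return 'dos' only if every terminated line ends with CR LF, else 'unix'."""
    lines = data.split(b"\n")
    # The last element is whatever follows the final LF (possibly empty).
    # It has no line break of its own, so it is left out of the check.
    terminated = lines[:-1]
    if terminated and all(line.endswith(b"\r") for line in terminated):
        return "dos"
    return "unix"
```

Under this sketch a single LF-only line is enough to make the whole file count as unix, which is why the CRs of the remaining lines then show up as ^M in the buffer.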

If you really want to know what's in the file, don't rely on a too-smart tool like vim :-)

Use:

od -xcb input_file_name | less

and check the line endings yourself.
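
If reading a dump by eye is tedious, the line endings can also be counted directly from the raw bytes. The following is a minimal sketch; count_line_endings is a hypothetical helper name, not something from the original answer.

```python
def count_line_endings(data):
    """Count CR LF pairs, lone LF bytes and lone CR bytes in a bytes object."""
    crlf = data.count(b"\r\n")
    lf_only = data.count(b"\n") - crlf   # LF not preceded by CR
    cr_only = data.count(b"\r") - crlf   # CR not followed by LF
    return {"crlf": crlf, "lf": lf_only, "cr": cr_only}
```

To use it on a file, open it in binary mode, for example count_line_endings(open("input_file_name", "rb").read()). A file with a nonzero "crlf" count and a zero "lf" count is consistently in DOS format.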

Answer by bvpb

How about:

find . -name "*.php" | xargs file | grep "CRLF"

I don't think it is reliable to use ^M to find the files.

Answer by jmort253

This is much like your original solution; therefore, it's possibly easier for you to remember:

# $'\r' is bash quoting for a literal carriage return; plain grep does not interpret "\r" itself
find . -name "*.php" | xargs grep -l $'\r'

Thought process:

In VIM, to remove the ^M you type:

 :%s/^M//g

Where ^ is your Ctrl key and M is the ENTER key. But I could never remember the keys to type to print that sequence, so I've always removed them using:

 :%s/\r//g

So my deduction is that the \r and ^M are equivalent, with the former being easier to remember to type.
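
The equivalence can be checked with a little arithmetic: ^M is caret notation for the control character whose code is the code of "M" minus 64, which is 13, the carriage return. Below is a small sketch; caret_notation is a made-up helper name. Note that whether a given tool expands the \r escape inside a pattern varies from tool to tool, so the equivalence is about the character, not about every command's syntax.

```python
def caret_notation(byte_value):
    """Return the caret notation (such as '^M') for an ASCII control code 0..31."""
    if not 0 <= byte_value < 32:
        raise ValueError("not an ASCII control code in the range 0..31")
    # Control codes map to the printable character 64 positions higher.
    return "^" + chr(byte_value + 64)


# "\r" is the carriage return, byte 13, which displays as ^M.
cr_display = caret_notation(ord("\r"))
```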

Answer by ghostdog74

GNU find

find . -type f -iname "*.php"  -exec file "{}" + | grep CRLF

I don't know what you want to do after you find those DOS php files, but if you want to convert them to unix format, then

find . -type f -iname "*.php"  -exec dos2unix "{}" +

will suffice. There's no need to specifically check whether they are DOS files or not.

Answer by firebus

I had good luck with:

find . -name "*.php" -exec grep -Pl "\r" {} \;

Answer by skeept

If you prefer vim to tell you which files are in this format you can use the following script:

"use this script to check which files are in dos format according to vim
"use: in the folder that you want to check
"create a file, say res.txt
"> vim -u NONE --noplugins res.txt
"> in vim: source this_script.vim

python << EOF
import os
import vim

cur_buf =  vim.current.buffer

IGNORE_START = ''.split()
IGNORE_END = '.pyc .swp .png ~'.split()

IGNORE_DIRS = '.hg .git dd_ .bzr'.split()

for dirpath, dirnames, fnames in os.walk(os.curdir):
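  # Prune ignored directories in place so os.walk does not descend into them
  # (removing items from dirnames while looping over it can skip entries)
  dirnames[:] = [d for d in dirnames if not d.endswith(tuple(IGNORE_DIRS))]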
  for fname in fnames:
    skip = False
    for fstart in IGNORE_START:
      if fname.startswith(fstart):
        skip = True
    for fend in IGNORE_END:
      if fname.endswith(fend):
        skip = True
    if skip is True:
      continue
    fname = os.path.join(dirpath, fname)
    vim.command('view {}'.format(fname))
    curr_ff = vim.eval('&ff')
    if vim.current.buffer != cur_buf:
      vim.command('bw!')
    if curr_ff == 'dos':
      cur_buf.append('{} {}'.format(curr_ff, fname))
EOF

Your vim needs to be compiled with Python support. (Python is used to loop over the files in the folder; there is probably an easier way of doing this, but I don't really know it.)

Answer by duplexddaann

If your dos2unix command has the -i option, you can use that feature to find files in a directory that have DOS line breaks.

$ man dos2unix
.
.
.
     -i[FLAGS], --info[=FLAGS] FILE ...
           Display file information. No conversion is done.

    The following information is printed, in this order:
    number of DOS line breaks,
    number of Unix line breaks,
    number of Mac line breaks,
    byte order mark,
    text or binary, file name.
.
.
.
Optionally extra flags can be set to change the (-i) output.
.
.
.
           c   Print only the files that would be converted.

The following one-liner script reads:

  • find all files in this directory tree,
  • run dos2unix -ic on all files to determine the files to be changed,
  • run dos2unix on the files to be changed.

$ find . -type f | xargs -d '\n' dos2unix -ic | xargs -d '\n' dos2unix
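
For systems where dos2unix lacks the -i option, the "rewrite only the files that would change" idea can be sketched in Python. This is an assumption-laden illustration rather than a replacement for dos2unix: to_unix and convert_file are made-up names, and unlike dos2unix the sketch makes no attempt to detect and skip binary files.

```python
def to_unix(data):
    """Replace every CR LF pair with a single LF."""
    return data.replace(b"\r\n", b"\n")


def convert_file(path):
    """Rewrite path with Unix line breaks; return True only if it was changed."""
    with open(path, "rb") as fh:
        original = fh.read()
    converted = to_unix(original)
    if converted == original:
        return False  # no CR LF found, leave the file untouched
    with open(path, "wb") as fh:
        fh.write(converted)
    return True
```

Because convert_file returns whether it changed anything, the list of files for which it returns True corresponds to what the first dos2unix -ic pass prints.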
