bash: How to detect DOS line endings in a file?

Question

Asked by chiggsy

I have a bunch of files. Some have Unix line endings, many have DOS line endings. I'd like to test each file to see if it is DOS formatted before I switch the line endings.

How would I do this? Is there a flag I can test for, or something similar?

Answer 1

Accepted answer by nc3b

You could search the file's contents for \r\n. That is the DOS-style line ending.

EDIT: Take a look at this

Answer 2

Answered by Eric O Lebigot

Python can automatically detect which newline convention is used in a file, thanks to the "universal newline mode" (U), and you can access Python's guess through the newlines attribute of file objects:

f = open('myfile.txt', 'U')
f.readline()  # Reads a line
# The following now contains the newline ending of the first line:
# It can be "\r\n" (Windows), "\n" (Unix), "\r" (Mac OS pre-OS X).
# If no newline is found, it contains None.
print repr(f.newlines)

This gives the line ending of the first line (Unix, DOS, etc.), if any.

As John M. pointed out, if by any chance you have a pathological file that uses more than one newline convention, then after reading many lines f.newlines is a tuple of all the newline conventions found so far.

Reference: http://docs.python.org/2/library/functions.html#open
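
For readers on Python 3, here is a minimal sketch of the same idea. It assumes Python 3, where universal newlines are the default for text mode (newline=None) and the 'U' flag is no longer accepted as of Python 3.11. The helper name detect_newlines and the latin-1 encoding (chosen only so that arbitrary bytes decode without errors) are illustrative assumptions, not part of the original answer.

```python
import os
import tempfile


def detect_newlines(path):
    """Return the newline convention(s) seen in a text file.

    The result is None (no newline found), a single string such as
    "\r\n", or a tuple of strings when the file mixes conventions.
    """
    # newline=None (the default) turns on universal newlines in Python 3.
    # latin-1 is an assumption made so that any byte sequence decodes.
    with open(path, "r", newline=None, encoding="latin-1") as f:
        for _ in f:  # read to the end so every line ending is seen
            pass
        return f.newlines


# Small demonstration on a throwaway file with DOS line endings.
with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
    tmp.write(b"first\r\nsecond\r\n")
dos_result = detect_newlines(tmp.name)
os.remove(tmp.name)
print(repr(dos_result))  # prints '\r\n'
```

Reading the whole file rather than one line is a deliberate choice here: it lets f.newlines report mixed conventions, at the cost of scanning every line.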

If you just want to convert a file, you can simply do:

with open('myfile.txt', 'U') as infile:
    text = infile.read()  # Automatic ("Universal read") conversion of newlines to "\n"
with open('myfile.txt', 'w') as outfile:
    outfile.write(text)  # Writes newlines for the platform running the program
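
The snippet above writes whatever newline the running platform uses. If the goal is specifically Unix line endings regardless of platform, one possible Python 3 sketch works on raw bytes instead. The function name convert_to_unix is hypothetical, the whole file is read into memory, and treating a lone "\r" as a line ending (old Mac style) is an assumption you may not want.

```python
import os
import tempfile


def convert_to_unix(path):
    """Rewrite a file in place so that every line ends with "\n"."""
    with open(path, "rb") as infile:
        data = infile.read()
    # Replace CRLF first so the lone-CR replacement does not double up.
    data = data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    with open(path, "wb") as outfile:
        outfile.write(data)


# Demonstration on a throwaway file with DOS line endings.
with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
    tmp.write(b"one\r\ntwo\r\n")
convert_to_unix(tmp.name)
with open(tmp.name, "rb") as check:
    converted = check.read()
os.remove(tmp.name)
print(converted)  # prints b'one\ntwo\n'
```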

Answer 3

Answered by johntellsall

(Python 2 only:) If you just want to read text files, either DOS or Unix-formatted, this works:

print open('myfile.txt', 'U').read()

That is, Python's "universal" file reader automatically recognizes all the different end-of-line markers and translates them to "\n".

http://docs.python.org/library/functions.html#open

(Thanks handle!)

Answer 4

Answered by Jonik

As a complete Python newbie, and just for fun, I tried to find a minimalistic way of checking this for one file. This seems to work:

# The b"..." literal and print() call work in both Python 2.6+ and Python 3
if b"\r\n" in open("/path/file.txt", "rb").read():
    print("DOS line endings found")

Edit: simplified as per John Machin's comment (no need to use regular expressions).

Answer 5

Answered by Femaref

DOS line breaks are \r\n; Unix uses only \n. So just search for \r\n.
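
Since the question involves a whole batch of files, the search for \r\n can be sketched as a small classifier. This is only an illustration for Python 3: the function name classify_line_endings, the labels it returns, and the 64 KiB chunk size are all assumptions. It reads in chunks so large files are not loaded at once, and it ignores lone "\r" (old Mac) endings.

```python
import os
import tempfile


def classify_line_endings(path, chunk_size=64 * 1024):
    """Return "dos", "unix", "mixed", or "none" for the file at path."""
    crlf = lone_lf = 0
    prev_was_cr = False  # remembers a "\r" at the end of the previous chunk
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total_lf = chunk.count(b"\n")
            crlf_here = chunk.count(b"\r\n")
            # A "\r\n" pair split across two chunks is not seen by count().
            if prev_was_cr and chunk.startswith(b"\n"):
                crlf_here += 1
            crlf += crlf_here
            lone_lf += total_lf - crlf_here
            prev_was_cr = chunk.endswith(b"\r")
    if crlf and lone_lf:
        return "mixed"
    if crlf:
        return "dos"
    if lone_lf:
        return "unix"
    return "none"


# Demonstration on three throwaway files.
samples = {
    "dos.txt": b"a\r\nb\r\n",
    "unix.txt": b"a\nb\n",
    "mixed.txt": b"a\r\nb\n",
}
results = {}
with tempfile.TemporaryDirectory() as tmpdir:
    for name, content in samples.items():
        path = os.path.join(tmpdir, name)
        with open(path, "wb") as f:
            f.write(content)
        results[name] = classify_line_endings(path)
print(results)
```

Only files reported as "dos" or "mixed" would then need their line endings switched.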

Answer 6

Answered by shallo

Using grep & bash:

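# Prints 1 if at least one line ends in a carriage return, 0 otherwise
grep -c -m 1 $'\r$' file

# Test without -m 1: counts every line ending in a carriage return (2 here)
echo $'\r\n\r\n' | grep -c $'\r$'

# Test with -m 1: stops at the first match, so the count is 1
echo $'\r\n\r\n' | grep -c -m 1 $'\r$'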

Answer 7

Answered by Cito

You can use the following function (which should work in Python 2 and Python 3) to get the newline representation used in an existing text file. All three possible kinds are recognized. The function reads the file only up to the first newline to decide. This is faster and uses less memory for larger text files, but it does not detect mixed line endings.

In Python 3, you can then pass the output of this function to the newline parameter of the open function when writing the file. This way you can alter the content of a text file without changing its newline representation.

def get_newline(filename):
    with open(filename, "rb") as f:
        while True:
            c = f.read(1)
            if not c or c == b'\n':
                break
            if c == b'\r':
                if f.read(1) == b'\n':
                    return '\r\n'
                return '\r'
    return '\n'
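
As a usage illustration for Python 3, the sketch below appends a line to a file while keeping its existing newline style. get_newline is repeated from the answer so the block runs on its own; the helper append_line and the sample file contents are hypothetical.

```python
import os
import tempfile


def get_newline(filename):
    # Same function as in the answer, repeated so this block is self-contained.
    with open(filename, "rb") as f:
        while True:
            c = f.read(1)
            if not c or c == b'\n':
                break
            if c == b'\r':
                if f.read(1) == b'\n':
                    return '\r\n'
                return '\r'
    return '\n'


def append_line(filename, line):
    """Append one line of text while keeping the file's newline style."""
    newline = get_newline(filename)
    # With newline set, every "\n" written is translated to that string.
    with open(filename, "a", newline=newline) as f:
        f.write(line + "\n")


# Demonstration on a throwaway file with DOS line endings.
with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
    tmp.write(b"first\r\n")
append_line(tmp.name, "second")
with open(tmp.name, "rb") as check:
    result = check.read()
os.remove(tmp.name)
print(result)  # prints b'first\r\nsecond\r\n'
```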
