Python: Removing newline from a csv file
Original source: http://stackoverflow.com/questions/14390123/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Removing newline from a csv file
Asked by ganesh reddy
I am trying to process a csv file in Python that has a ^M character in the middle of each row/line, which acts as a newline. I can't open the file in any mode other than 'rU'.
If I do open the file in the 'rU' mode, it reads in the newline and splits the file (creating a newline) and gives me twice the number of rows.
I want to remove the newline altogether. How?
Accepted answer by abarnert
Note that, as the docs say:
csvfile can be any object which supports the iterator protocol and returns a string each time its next() method is called; file objects and list objects are both suitable.
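As a small illustration of the quoted passage, here is a sketch in which a plain list of strings stands in for the file. The sample data and the names `lines` and `rows` are made up for the example.

```python
import csv

# Hypothetical sample data: any iterable that yields strings will do.
lines = ['name,qty', 'apple,3']

# csv.reader only needs an iterator of strings, not an actual file object.
rows = list(csv.reader(lines))
print(rows)
```

This is why a generator that cleans each line can sit between the file and the reader.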
So, you can always stick a filter on the file before handing it to your reader or DictReader. Instead of this:
    import csv
    with open('myfile.csv', 'rU') as myfile:
        for row in csv.reader(myfile):
            print(row)  # process each row here
Do this:
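    import csv
    # 'rU' is the Python 2 universal-newline mode; it was removed in Python 3.11.
    with open('myfile.csv', 'rU') as myfile:
        filtered = (line.replace('\r', '') for line in myfile)
        for row in csv.reader(filtered):
            print(row)  # process each row here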
That '\r' is the Python (and C) way of spelling ^M. So, this just strips all ^M characters out, no matter where they appear, by replacing each one with an empty string.
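On Python 3 the same filtering idea needs one extra precaution: in universal-newline mode a bare '\r' can already be treated as a line break before the filter ever sees it. The sketch below assumes Python 3 and that the real row terminator contains '\n'; it opens the file with newline='\n' so that only '\n' ends a line. The function name `read_rows_without_cr` is hypothetical.

```python
import csv

def read_rows_without_cr(path):
    """Return CSV rows with every carriage return stripped (Python 3 sketch)."""
    # newline='\n' means only '\n' terminates a line, so a stray '\r' in the
    # middle of a row stays inside that line instead of splitting it.
    with open(path, newline='\n') as myfile:
        filtered = (line.replace('\r', '') for line in myfile)
        # Consume the reader while the file is still open.
        return list(csv.reader(filtered))
```

Calling `read_rows_without_cr('myfile.csv')` would then yield one row per '\n'-terminated line. Note that text on either side of a removed '\r' is joined directly, so a row like `a,b\rc,d` becomes the three fields `a`, `bc`, `d`.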
I guess I want to modify the file permanently as opposed to filtering it.
First, if you want to modify the file before running your Python script on it, why not do that from outside of Python? sed, tr, many text editors, etc. can all do this for you. Here's a GNU sed example:
    gsed -i'' 's/\r//g' myfile.csv
But if you want to do it in Python, it's not that much more verbose, and you might find it more readable, so:
First, you can't really modify a file in-place if you want to insert or delete from the middle. The usual solution is to write a new file, and either move the new file over the old one (Unix only) or delete the old one (cross-platform).
The cross-platform version:
    import os

    # Keep the original as a backup, write the cleaned copy, then drop the backup.
    os.rename('myfile.csv', 'myfile.csv.bak')
    with open('myfile.csv.bak', 'rU') as infile, open('myfile.csv', 'w') as outfile:
        for line in infile:
            outfile.write(line.replace('\r', ''))
    os.remove('myfile.csv.bak')
The less-clunky, but Unix-only, version:
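    import os
    import tempfile
    from contextlib import closing

    # Create the temp file next to the target so os.rename stays on one filesystem.
    temp = tempfile.NamedTemporaryFile(mode='w', dir='.', delete=False)
    with open('myfile.csv', 'rU') as myfile, closing(temp):
        for line in myfile:
            temp.write(line.replace('\r', ''))
    os.rename(temp.name, 'myfile.csv')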
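For readers on Python 3, the write-a-new-file-then-swap approach described above might look like the sketch below. It is not the answer's original code: it assumes Python 3.3 or later, where os.replace overwrites the destination on both Unix and Windows, and the function name `strip_carriage_returns` is hypothetical.

```python
import os
import tempfile

def strip_carriage_returns(path):
    """Rewrite the file at `path` without any '\\r' characters (Python 3 sketch)."""
    # Put the temp file in the same directory so the final swap never has to
    # cross a filesystem boundary.
    directory = os.path.dirname(os.path.abspath(path))
    # newline='\n' on input keeps a stray '\r' inside its line; newline='' on
    # output writes each '\n' exactly as given, with no platform translation.
    with open(path, newline='\n') as src, tempfile.NamedTemporaryFile(
            mode='w', newline='', dir=directory, delete=False) as tmp:
        for line in src:
            tmp.write(line.replace('\r', ''))
    # Both files are closed at this point, so the swap also works on Windows.
    os.replace(tmp.name, path)
```

A call such as `strip_carriage_returns('myfile.csv')` would modify the file permanently, so keeping a backup copy first is a sensible precaution.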

