UnicodeDecodeError in Python 3 when importing a CSV file
Source: http://stackoverflow.com/questions/12752313/
Note: this page is taken from a popular Stack Overflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, cite the source URL and author information, and attribute it to the original authors (not me): Stack Overflow
Asked by Ryan Rapini
I'm trying to import a CSV, using this code:
import csv
import sys

def load_csv(filename):
    # Open file for reading
    file = open(filename, 'r')
    # Read in file
    return csv.reader(file, delimiter=',', quotechar='\n')

def main(argv):
    csv_file = load_csv("myfile.csv")
    for item in csv_file:
        print(item)

if __name__ == "__main__":
    main(sys.argv[1:])
Here's a sample of my csv file:
foo,bar,test,1,2
this,wont,work,because,α
And the error:
Traceback (most recent call last):
  File "test.py", line 22, in <module>
    main(sys.argv[1:])
  File "test.py", line 18, in main
    for item in csv_file:
  File "/usr/lib/python3.2/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xce in position 40: ordinal not in range(128)
Obviously, it's hitting the character at the end of the CSV and throwing that error, but I'm at a loss as to how to fix this. Any help?
This is:
Python 3.2.3 (default, Apr 23 2012, 23:35:30)
[GCC 4.7.0 20120414 (prerelease)] on linux2
Accepted answer by jfs
It seems your problem boils down to:
print("α")
You could fix it by specifying PYTHONIOENCODING:
$ PYTHONIOENCODING=utf-8 python3 test.py > output.txt
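To see what the variable changes, here is a minimal sketch (assuming Python 3.7+ for `capture_output`; the helper name `stdout_encoding_with` is made up for this illustration). It starts a child interpreter with PYTHONIOENCODING set and asks it which encoding its standard output uses:

```python
import codecs
import os
import subprocess
import sys


def stdout_encoding_with(io_encoding):
    """Return sys.stdout.encoding as seen by a child Python process.

    Hypothetical helper for illustration; not part of the original answer.
    """
    # Copy the current environment and add the override for the child only.
    env = dict(os.environ, PYTHONIOENCODING=io_encoding)
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.stdout.encoding)"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    # Normalize spelling variants such as "UTF-8" or "utf8" to the codec name.
    return codecs.lookup(result.stdout.strip()).name


if __name__ == "__main__":
    print(stdout_encoding_with("utf-8"))
```

The child's output is captured through a pipe, which is the same situation as the `> output.txt` redirection above: there is no terminal for Python to take an encoding from.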
Note:
$ python3 test.py
should work as is if your terminal configuration supports it, where test.py is:
import csv

with open('myfile.csv', newline='', encoding='utf-8') as file:
    for row in csv.reader(file):
        print(row)
If open() has no encoding parameter above, then you'll get UnicodeDecodeError with LC_ALL=C.
Also, with LC_ALL=C you'll get UnicodeEncodeError even if there is no redirection, i.e., PYTHONIOENCODING is necessary in this case.
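The two errors come from opposite directions: decoding bytes when reading the file, and encoding text when printing. A rough sketch of both, assuming the sample file from the question is stored as UTF-8 and using the ASCII codec explicitly to stand in for what LC_ALL=C selects (the names `SAMPLE` and `raw` are only for illustration):

```python
# The two sample rows from the question, as they would sit on disk in UTF-8.
SAMPLE = "foo,bar,test,1,2\nthis,wont,work,because,α\n"
raw = SAMPLE.encode("utf-8")

# Reading: bytes -> str. An ASCII decoder fails on the first byte of "α".
try:
    raw.decode("ascii")
except UnicodeDecodeError as exc:
    decode_error = exc
    print(exc)

# Writing: str -> bytes. An ASCII encoder fails on the character itself.
try:
    "α".encode("ascii")
except UnicodeEncodeError as exc:
    encode_error = exc
    print(exc)
```

The first message reports byte 0xce at position 40, the same byte and offset as in the traceback in the question: "α" is the two bytes 0xCE 0xB1 in UTF-8, and 40 bytes precede it in the sample.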
Answer by TheDude
According to the Python docs, you have to set the encoding for the file. Here is an example from the docs:
import csv

with open('some.csv', newline='', encoding='utf-8') as f:
    reader = csv.reader(f)
    for row in reader:
        print(row)
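To check the reading side on its own, without involving the terminal, here is a small self-contained sketch. It assumes the data is UTF-8, writes the question's two sample rows to a temporary file, and reads them back with an explicit encoding (`read_rows` and `SAMPLE` are illustrative names, not from the docs):

```python
import csv
import os
import tempfile

SAMPLE = "foo,bar,test,1,2\nthis,wont,work,because,α\n"


def read_rows(path, encoding="utf-8"):
    """Read all rows of a CSV file, decoding it with an explicit encoding."""
    with open(path, newline="", encoding=encoding) as f:
        return list(csv.reader(f))


# Write the sample as raw UTF-8 bytes so the file does not depend on the locale.
fd, path = tempfile.mkstemp(suffix=".csv")
with os.fdopen(fd, "wb") as f:
    f.write(SAMPLE.encode("utf-8"))

try:
    rows = read_rows(path)
finally:
    os.remove(path)

# Compare rather than print, so the check also passes on an ASCII-only terminal.
assert rows[1][-1] == "\u03b1"
```

Because nothing is printed, this succeeds even under LC_ALL=C, which helps separate a decoding problem in open() from an encoding problem in print().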
Edit: Your problem appears to happen with printing. Try using the pretty printer:
import csv
import pprint

with open('some.csv', newline='', encoding='utf-8') as f:
    reader = csv.reader(f)
    for row in reader:
        pprint.pprint(row)
Answer by Ayush Abhijeet
Another option is to cover up the errors by passing an error handler:
with open('some.csv', newline='', errors='replace') as f:
    reader = csv.reader(f)
    for row in reader:
        print(row)
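which will replace any undecodable bytes in the file with a "missing character" (U+FFFD, the Unicode replacement character). The original characters are lost, so this hides the problem rather than fixing it.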
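A short sketch of what that replacement does to the data, assuming UTF-8 bytes decoded with the ASCII codec (as would happen when open() gets errors='replace' but no encoding under an ASCII locale); the variable names are illustrative:

```python
# One field from the question's sample row, as UTF-8 bytes.
raw = "because,α".encode("utf-8")

# Wrong codec plus errors="replace": no exception, but the character is lost.
lossy = raw.decode("ascii", errors="replace")

# Right codec: nothing needs replacing.
exact = raw.decode("utf-8")

# ascii() escapes non-ASCII characters, so this prints safely on any terminal.
print(ascii(lossy))
print(ascii(exact))
```

Here "α" occupies two bytes, so it comes back as two replacement characters rather than one. Passing encoding='utf-8' as in the other answers avoids the loss entirely.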

