UnicodeDecodeError in Python 3 when importing a CSV file

Question

Asked by Ryan Rapini

I'm trying to import a CSV, using this code:

    import csv
    import sys

    def load_csv(filename):
        # Open file for reading
        file = open(filename, 'r')

        # Read in file
        return csv.reader(file, delimiter=',', quotechar='\n')

    def main(argv):
        csv_file = load_csv("myfile.csv")

        for item in csv_file:
            print(item)

    if __name__ == "__main__":
        main(sys.argv[1:])

Here's a sample of my csv file:

    foo,bar,test,1,2
    this,wont,work,because,α

And the error:

    Traceback (most recent call last):
      File "test.py", line 22, in <module>
        main(sys.argv[1:])
      File "test.py", line 18, in main
        for item in csv_file:
      File "/usr/lib/python3.2/encodings/ascii.py", line 26, in decode
        return codecs.ascii_decode(input, self.errors)[0]
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xce in position 40: ordinal not in range(128)

Obviously, it's hitting the character at the end of the CSV and throwing that error, but I'm at a loss as to how to fix this. Any help?

This is:

    Python 3.2.3 (default, Apr 23 2012, 23:35:30)
    [GCC 4.7.0 20120414 (prerelease)] on linux2

Answer 1

Accepted answer by jfs

It seems your problem boils down to:

    print("α")

You could fix it by specifying PYTHONIOENCODING:

    $ PYTHONIOENCODING=utf-8 python3 test.py > output.txt
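
To illustrate why the output side matters, here is a minimal sketch, assuming the output stream is ASCII-only (as it would be under a C locale). The helper name `write_utf8` is hypothetical and not part of the original answer; it wraps a binary stream so that text is always written as UTF-8, which is roughly the effect PYTHONIOENCODING=utf-8 has on standard output.

```python
import io

def write_utf8(text, binary_stream):
    """Encode text as UTF-8 and write it to a binary stream (hypothetical helper)."""
    wrapper = io.TextIOWrapper(binary_stream, encoding="utf-8", newline="")
    wrapper.write(text)
    wrapper.flush()
    wrapper.detach()  # release the underlying stream without closing it

# An ASCII-only stream cannot represent the character at all:
try:
    "α".encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# Written through a UTF-8 wrapper, the same character becomes two bytes.
buffer = io.BytesIO()
write_utf8("α\n", buffer)
utf8_bytes = buffer.getvalue()
```

In a script, passing sys.stdout.buffer instead of the in-memory buffer should have a similar effect to setting the environment variable, though the environment variable is the less intrusive fix.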

Note:

    $ python3 test.py

should work as is if your terminal configuration supports it, where test.py is:

    import csv

    with open('myfile.csv', newline='', encoding='utf-8') as file:
        for row in csv.reader(file):
            print(row)

If open() has no encoding parameter above, then you'll get UnicodeDecodeError with LC_ALL=C.
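
The reading side can be checked in isolation with a small self-contained sketch. It assumes the file is UTF-8 encoded, as in the answer above; the helper `read_rows` and the temporary file are illustrative only and reuse the sample data from the question.

```python
import csv
import os
import tempfile

def read_rows(path, encoding="utf-8"):
    """Read all CSV rows from path, decoding with an explicit encoding."""
    with open(path, newline="", encoding=encoding) as f:
        return list(csv.reader(f))

# Sample data taken from the question, written out as UTF-8.
sample = "foo,bar,test,1,2\nthis,wont,work,because,α\n"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "myfile.csv")
    with open(path, "w", newline="", encoding="utf-8") as f:
        f.write(sample)
    rows = read_rows(path)
```

Because the encoding is passed explicitly, decoding no longer depends on the locale; only printing the rows afterwards can still fail if the output stream is ASCII-only.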

Also with LC_ALL=C you'll get UnicodeEncodeError even if there is no redirection, i.e., PYTHONIOENCODING is necessary in this case.
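
To see which of the two encodings is the culprit on a given machine, the following sketch reports both. The function name `describe_encodings` is hypothetical; it relies only on the standard locale and sys modules.

```python
import locale
import sys

def describe_encodings():
    """Return the encodings Python would use by default (hypothetical helper)."""
    return {
        # Used by open() when no encoding argument is given.
        "open_default": locale.getpreferredencoding(False),
        # Used by print() when writing to standard output; may be None
        # if stdout has been replaced by an object without an encoding.
        "stdout": getattr(sys.stdout, "encoding", None),
    }

info = describe_encodings()
```

If either value names an ASCII codec (for example ANSI_X3.4-1968 under LC_ALL=C), non-ASCII characters will likely fail on that side unless an encoding is set explicitly.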

Answer 2

Answer by TheDude

According to the Python docs, you have to set the encoding for the file. Here is an example from the site:

    import csv

    with open('some.csv', newline='', encoding='utf-8') as f:
        reader = csv.reader(f)
        for row in reader:
            print(row)

Edit: Your problem appears to happen with printing. Try using pretty printer:

    import csv
    import pprint

    with open('some.csv', newline='', encoding='utf-8') as f:
        reader = csv.reader(f)
        for row in reader:
            pprint.pprint(row)

Answer 3

Answer by Ayush Abhijeet

Another option is to cover up the errors by passing an error handler:

    with open('some.csv', newline='', errors='replace') as f:
        reader = csv.reader(f)
        for row in reader:
            print(row)

which will replace any undecodable bytes in the file with a "missing character".
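
As a rough illustration of what the "missing character" looks like, the sketch below decodes the question's second sample line as ASCII with errors='replace'. It assumes the file bytes are UTF-8 and the default encoding is ASCII, matching the traceback in the question.

```python
# The second sample line from the question, as UTF-8 bytes on disk.
raw = "this,wont,work,because,α\n".encode("utf-8")

# Decoding as ASCII with errors="replace" substitutes U+FFFD
# for each byte that ASCII cannot represent.
decoded = raw.decode("ascii", errors="replace")

# "α" occupies two bytes in UTF-8, so two replacement characters appear.
replaced_count = decoded.count("\ufffd")
```

The original character is lost, so this approach hides the error rather than fixing the decoding; passing the correct encoding is preferable when the data matters.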
