Python 'utf-8' codec can't decode byte 0xa0 in position 4276: invalid start byte

Question

Asked by Vital

I am trying to read and print the following file: txt.tsv (from https://www.sec.gov/files/dera/data/financial-statement-and-notes-data-sets/2017q3_notes.zip)

According to the SEC the data set is provided in a single encoding, as follows:

Tab Delimited Value (.txt): utf-8, tab-delimited, \n- terminated lines, with the first line containing the field names in lowercase.

My current code:

import csv

with open('txt.tsv') as tsvfile:
    reader = csv.DictReader(tsvfile, dialect='excel-tab')
    for row in reader:
        print(row)

All attempts ended with the following error message:

'utf-8' codec can't decode byte 0xa0 in position 4276: invalid start byte

I am a bit lost. Can anyone help me? Many thanks in advance.

Answer 1

Answered by koPytok

The file is encoded as 'windows-1252', not UTF-8. Use:

open('txt.tsv', encoding='windows-1252')
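
To see why this helps, here is a minimal sketch that reproduces the error and the fix on a small temporary file. The sample row and column names are made up for illustration (they are not taken from the SEC file); the only assumption is that the real file contains byte 0xa0, which is a no-break space in windows-1252 but cannot start a character in UTF-8.

```python
import csv
import os
import tempfile

# Hypothetical sample data: byte 0xa0 is a no-break space in windows-1252.
sample = "adsh\ttag\tvalue\n0001\tRevenue\t1\u00a0000\n".encode("windows-1252")

with tempfile.NamedTemporaryFile(suffix=".tsv", delete=False) as tmp:
    tmp.write(sample)
    path = tmp.name

try:
    # Reading the bytes as UTF-8 reproduces the error from the question.
    try:
        with open(path, encoding="utf-8", newline="") as tsvfile:
            list(csv.DictReader(tsvfile, dialect="excel-tab"))
        utf8_failed = False
    except UnicodeDecodeError as exc:
        utf8_failed = True
        print("utf-8 failed:", exc.reason)

    # Reading the same bytes as windows-1252 succeeds.
    with open(path, encoding="windows-1252", newline="") as tsvfile:
        rows = list(csv.DictReader(tsvfile, dialect="excel-tab"))
finally:
    os.remove(path)

for row in rows:
    print(row)
```

Passing the encoding explicitly also avoids depending on the platform default, which differs between systems.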

Answer 2

Answered by Hasim D

If you are working with Turkish data, I suggest this line:

import pandas as pd
df = pd.read_csv("text.txt", encoding='windows-1254')
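
The choice between windows-1252 and windows-1254 matters because both decode the same bytes without raising an error, but they map a few of them to different letters. The following is a small sketch using only the standard library; the sample string is made up for illustration.

```python
# Hypothetical sample: four Turkish letters encoded with the Turkish code page.
raw = "ığşİ".encode("windows-1254")

as_1254 = raw.decode("windows-1254")  # correct Turkish letters
as_1252 = raw.decode("windows-1252")  # no error, but the wrong letters

print(as_1254)
print(as_1252)
```

Because the wrong code page fails silently here, it is worth checking a few known words in the decoded output rather than relying on the absence of an exception.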

Answer 3

Answered by Ghulam Dastgeer

I had the same error message for a .csv file, and this worked for me:

# Note: 'ANSI' is an alias of the 'mbcs' codec, which is available on Windows only.
df = pd.read_csv('Text.csv', encoding='ANSI')

Answer 4

Answered by raj kumar

import pandas as pd
ds = pd.read_csv('/Dataset/test.csv', encoding='windows-1252')

Works fine for me, thanks.
