Python UnicodeDecodeError: 'utf-8' codec can't decode byte 0x96 in position 35: invalid start byte
Notice: This page is a Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow
Original URL: http://stackoverflow.com/questions/45529507/
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x96 in position 35: invalid start byte
Asked by user3734568
I am new to Python and am trying to read a CSV file using the script below.
import pandas as pd
Past = pd.read_csv("C:/Users/Admin/Desktop/Python/Past.csv", encoding='utf-8')
But I get the error "UnicodeDecodeError: 'utf-8' codec can't decode byte 0x96 in position 35: invalid start byte". Please help me understand the issue here. I passed encoding in the script, thinking it would resolve the error.
Answered by Liam
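This happens because you chose the wrong encoding: the file is not valid UTF-8. In UTF-8 the byte 0x96 can only appear as a continuation byte, never as the first byte of a character, which is why the decoder reports an invalid start byte.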
If you are on Windows, replacing
Past=pd.read_csv("C:/Users/Admin/Desktop/Python/Past.csv",encoding='utf-8')
with
Past=pd.read_csv("C:/Users/Admin/Desktop/Python/Past.csv",encoding='cp1252')
should solve the problem.
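To see why the codec choice matters, here is a minimal sketch that works on raw bytes rather than the CSV from the question. It assumes the offending byte is an en dash saved by a Windows program, which is one common source of 0x96 but not the only possible one. The sample text is made up for illustration.

```python
# Hypothetical sample: an en dash (U+2013) saved in the Windows cp1252 encoding.
raw = "Sales 2016 \u2013 2017".encode("cp1252")

# 0x96 is a continuation byte in UTF-8, so it cannot start a character.
try:
    raw.decode("utf-8")
    utf8_error = None
except UnicodeDecodeError as exc:
    utf8_error = exc
    print(exc)

# Decoding with the encoding the bytes were written in recovers the text.
text = raw.decode("cp1252")
print(text)
```

If the file was actually written in a different encoding, cp1252 may still decode without error but show the wrong characters, so it is worth checking the result by eye.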
Answered by Nitish Kumar Pal
Use this solution; it will strip out (ignore) the undecodable characters and return the string without them. Only use it if you need to strip them, not convert them.
with open(path, encoding="utf8", errors='ignore') as f:
    text = f.read()  # bytes that are not valid UTF-8 are dropped
Using errors='ignore'
You will just lose some characters. But if you don't care about them, as they seem to be extra characters originating from the bad formatting and programming of the clients connecting to my socket server, then it is an easy, direct solution.
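As a rough illustration of what is lost, the sketch below writes a few bytes containing a stray 0x96 to a temporary file and reads it back twice, once with errors='ignore' and once with errors='replace'. The byte string and the temporary file are made up for the example. errors='replace' is not part of the answer above; it is shown only for comparison.

```python
import os
import tempfile

# Hypothetical data: valid text with one stray 0x96 byte in the middle.
raw = b"caf\x96e latte"

# Write the bytes to a temporary file so open() can be demonstrated.
with tempfile.NamedTemporaryFile(delete=False, suffix=".txt") as tmp:
    tmp.write(raw)
    path = tmp.name

try:
    with open(path, encoding="utf8", errors="ignore") as f:
        ignored = f.read()   # the bad byte is dropped silently
    with open(path, encoding="utf8", errors="replace") as f:
        replaced = f.read()  # the bad byte becomes U+FFFD
finally:
    os.remove(path)

print(repr(ignored))
print(repr(replaced))
```

With 'replace' the damaged spot stays visible in the text, which can make it easier to notice that data went missing.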
Answered by ask_me
Try using:
pd.read_csv("Your filename", encoding="ISO-8859-1")
The code that I parsed from some website was in this encoding instead of the default UTF-8 encoding, which is the standard.
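One point worth knowing about ISO-8859-1: it assigns a character to every possible byte, so decoding with it never raises, but that does not mean the result is what the author of the file intended. The sketch below uses a made-up byte string to compare it with cp1252 on the byte from the error message.

```python
import codecs

raw = b"2016 \x96 2017"  # hypothetical bytes containing 0x96

as_latin1 = raw.decode("ISO-8859-1")  # 0x96 becomes the control character U+0096
as_cp1252 = raw.decode("cp1252")      # 0x96 becomes the en dash U+2013

print(repr(as_latin1))
print(repr(as_cp1252))

# "latin1" and "ISO-8859-1" are aliases for the same Python codec.
same_codec = codecs.lookup("latin1").name == codecs.lookup("ISO-8859-1").name
print(same_codec)
```

So if the file came from a Windows program, cp1252 is likely to give more readable text for bytes in the 0x80 to 0x9F range, while ISO-8859-1 is the safer choice when the goal is simply never to fail.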
Answered by Jason Goal
The following works very well for me:
encoding = 'latin1'
Answered by Juba Fourali
Using the code below works for me:
with open(keeniz_dir + '/world_cities.csv', 'r', encoding='latin1') as infile:
    data = infile.read()  # latin1 maps every byte to a character, so decoding cannot fail
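The answers above each pick a single encoding. When the encoding of a file is unknown, one possible approach is to try a few candidates in order. The helper below is a hypothetical sketch, not something from the answers: the function name and the candidate order (utf-8, then cp1252, then latin1) are assumptions. A decode that succeeds only shows the bytes are valid in that encoding, not that it is the right one.

```python
def decode_with_fallback(raw, encodings=("utf-8", "cp1252", "latin1")):
    """Return (text, encoding) for the first encoding that decodes raw."""
    for name in encodings:
        try:
            return raw.decode(name), name
        except UnicodeDecodeError:
            continue
    # Only reachable if no catch-all such as latin1 is in the list.
    raise ValueError("none of the candidate encodings could decode the data")


# Made-up sample: text saved as cp1252, which is not valid UTF-8.
sample = "r\u00e9sum\u00e9 \u2013 draft".encode("cp1252")
text, used = decode_with_fallback(sample)
print(used, text)
```

The encoding name returned can then be passed to pd.read_csv or open() through the encoding argument.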