Python pickle error: UnicodeDecodeError
Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep it under the same license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/32957708/
Asked by 90abyss
I'm trying to do some text classification using Textblob. I'm first training the model and serializing it using pickle as shown below.
import pickle
from textblob.classifiers import NaiveBayesClassifier

# Train the classifier from the CSV file.
with open('sample.csv', 'r') as fp:
    cl = NaiveBayesClassifier(fp, format="csv")

# Serialize the trained classifier; 'wb' writes raw bytes.
with open('sample_classifier.pickle', 'wb') as f:
    pickle.dump(cl, f)
And when I try to run this file:
import pickle
f = open('sample_classifier.pickle', encoding="utf8")
cl = pickle.load(f)
f.close()
I get this error:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0: invalid start byte
Here are the contents of my sample.csv:
My SQL is not working correctly at all. This was a wrong choice, SQL
I've issues. Please respond immediately, Support
Where am I going wrong here? Please help.
Accepted answer by donkopotamus
By choosing to open the file in mode wb, you are choosing to write in raw binary. No character encoding is applied.
Thus, to read this file, you should simply open it in mode rb.
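The 0x80 byte in the error message is consistent with this explanation: pickle streams written with protocol 2 or later begin with the byte 0x80, which is not a valid first byte in UTF-8. A minimal sketch, assuming a plain dict as a stand-in for the trained classifier (the dict and its keys are illustrative, not the asker's data):

```python
import pickle

# A plain dict stands in for the trained classifier (illustrative assumption).
obj = {"label": "SQL", "score": 0.9}

# These are the same bytes pickle.dump would write to a file opened in 'wb' mode.
data = pickle.dumps(obj)

# Protocol 2 and later start the stream with the PROTO opcode, byte 0x80.
first_byte = data[0]

# Decoding the bytes as UTF-8 text fails, much like opening the file in text mode.
try:
    data.decode("utf8")
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True

# Treating the data as raw bytes works.
restored = pickle.loads(data)
```

Opening the pickle file with encoding="utf8" asks Python to perform that same decode step on the file contents, which is why the error points at position 0.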
Answer by saulspatz
I think you should open the file as
f = open('sample_classifier.pickle', 'rb')
cl = pickle.load(f)
You shouldn't have to decode it. pickle.load will give you an exact copy of whatever it is you saved. At this point, you should be able to work with cl as if you had just created it.
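As a rough end-to-end illustration of the advice above, the sketch below writes an object with mode wb and reads it back with mode rb. It assumes a plain dict in place of the Textblob classifier and a temporary directory for the file; both are stand-ins chosen so the example runs without extra dependencies.

```python
import os
import pickle
import tempfile

# Stand-in for the trained classifier (illustrative assumption).
original = {"labels": ["SQL", "Support"], "count": 2}

# Write to a temporary directory so the example leaves the working directory alone.
path = os.path.join(tempfile.mkdtemp(), "sample_classifier.pickle")

# Write in binary mode: pickle produces bytes, not text.
with open(path, "wb") as f:
    pickle.dump(original, f)

# Read in binary mode as well, with no encoding argument.
with open(path, "rb") as f:
    loaded = pickle.load(f)

# The loaded object compares equal to the original but is a separate object.
same_value = loaded == original
same_object = loaded is original
```

Using with blocks here also closes the files automatically, so no explicit f.close() is needed.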