pandas: dump and load a dill (pickle) in two different files
Original source: http://stackoverflow.com/questions/38447815/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me):
StackOverflow
dump and load a dill (pickle) in two different files
Asked by achimneyswallow
I think this is basic for anyone who knows how to work with pickle. However, I still can't get it right after trying for a few hours. I have the following code:
In the first file
import pandas as pd
names = ["John", "Mary", "Mary", "Suzanne", "John", "Suzanne"]
scores = [80, 90, 90, 92, 95, 100]
records = pd.DataFrame({"name": names, "score": scores})
means = records.groupby('name').mean()
def name_score_function(record):
    if record in names:
        return(means.loc[record, 'score'])
import dill as pickle
with open('name_model.pkl', 'wb') as file:
    pickle.dump(means, file)
The second file
I would like to load what I have in the first file and make the score of a person (i.e. John, Mary, Suzanne) callable via a function name_model(record):
import dill as pickle
B = pickle.load('name_model.pkl')
def name_model(record):
    if record in names:
        return(means.loc[record, 'score'])
Here it shows the error:
  File "names.py", line 21, in <module>
    B = pickle.load('name_model.pkl')
  File "/opt/conda/lib/python2.7/site-packages/dill/dill.py", line 197, in load
    pik = Unpickler(file)
  File "/opt/conda/lib/python2.7/site-packages/dill/dill.py", line 356, in __init__
    StockUnpickler.__init__(self, *args, **kwds)
  File "/opt/conda/lib/python2.7/pickle.py", line 847, in __init__
    self.readline = file.readline
AttributeError: 'str' object has no attribute 'readline'
I know the error comes from my lack of understanding of pickle. I would humbly accept your opinions to improve this code. Thank you!!
UPDATE: The more specific thing I would like to achieve:
I would like to be able to use the function that I write in the first file and dump it, and then read it in the second file and be able to use this function to query the mean score of any person in the records.
Here is what I have:
import pandas as pd
names = ["John", "Mary", "Mary", "Suzanne", "John", "Suzanne"]
scores = [80, 90, 90, 92, 95, 100]
records = pd.DataFrame({"name": names, "score": scores})
means = records.groupby('name').mean()
def name_score_function(record):
    if record in names:
        return(means.loc[record, 'score'])
B = name_score_function(record)
import dill as pickle
with open('name_model.pkl', 'wb') as file:
    pickle.dump(B, file)
with open('name_model.pkl', 'rb') as file:
    B = pickle.load(f)
def name_model(record):
    return B(record)
print(name_model("John"))
When I execute this code, I get this error:
  File "test.py", line 13, in <module>
    B = name_score_function(record)
NameError: name 'record' is not defined
I highly appreciate your assistance and patience.
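A note on the NameError in the update: `name_score_function(record)` calls the function at that line, and no variable named `record` exists at module level. Binding the function object itself means writing the name without parentheses. The sketch below shows only that distinction in plain Python; the dict is a hypothetical stand-in for the `means` DataFrame, and whether dill carries the globals a dumped function relies on into another file is not shown or verified here.

```python
# Plain-dict stand-in for the `means` DataFrame (assumption for illustration).
means = {"John": 87.5, "Mary": 90.0, "Suzanne": 96.0}

def name_score_function(record):
    # Return the mean score for a known name, otherwise None.
    if record in means:
        return means[record]
    return None

# Calling the function needs an argument that already exists:
#   B = name_score_function(record)   # NameError: name 'record' is not defined
# Referencing it (no parentheses) binds the function object itself.
B = name_score_function

john_mean = B("John")
print(john_mean)
```

After this, `B` and `name_score_function` are the same object, so `B("John")` behaves exactly like a direct call.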
Answered by achimneyswallow
Thank you. It looks like the following can solve the problem.
import pandas as pd
names = ["John", "Mary", "Mary", "Suzanne", "John", "Suzanne"]
scores = [80, 90, 90, 92, 95, 100]
records = pd.DataFrame({"name": names, "score": scores})
means = records.groupby('name').mean()
import dill as pickle
with open('name_model.pkl', 'wb') as file:
    pickle.dump(means, file)
with open('name_model.pkl', 'rb') as file:
    B = pickle.load(file)
def name_score_function(record):
    if record in names:
        return(means.loc[record, 'score'])
print(name_score_function("John"))
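One thing to watch in the code above: the function still reads `names` and `means`, which exist only because everything runs in one script. In a genuinely separate second file, only the loaded object `B` would be available. A minimal sketch of that split follows, with two loud assumptions: the standard-library `pickle` stands in for dill (`dill.dump` and `dill.load` take the same object and file arguments), and a plain dict stands in for the `means` DataFrame so the example runs without pandas. The temporary directory is also just for illustration.

```python
import os
import pickle  # stand-in for dill in this sketch
import tempfile

# Hypothetical stand-in for the `means` DataFrame: name -> mean score.
means = {"John": 87.5, "Mary": 90.0, "Suzanne": 96.0}

path = os.path.join(tempfile.mkdtemp(), "name_model.pkl")

# --- what the first file would do: write the data ---
with open(path, "wb") as f:
    pickle.dump(means, f)

# --- what the second file would do: only the loaded object exists here ---
with open(path, "rb") as f:
    B = pickle.load(f)

def name_model(record):
    # Look the name up in the loaded object, not in variables from the first file.
    return B.get(record)

print(name_model("John"))
```

With the real DataFrame, the equivalent lookup would presumably be `if record in B.index: return B.loc[record, 'score']`, since the names become the index after `groupby('name')`.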
Answered by eafit
Hmm. You need to read it the same way you wrote it, nested inside an open clause:
import dill as pickle
with open('name_model.pkl', 'rb') as f:
    B = pickle.load(f)
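The traceback in the question follows from this: `load` expects an open file object with `read` and `readline` methods, and a path string has neither. Here is a small sketch of both cases, assuming the standard-library `pickle` in place of dill (the question's traceback shows dill handing the argument to the stock unpickler) and a made-up one-entry dict as the data. The exception type differs by Python version, so the sketch catches both.

```python
import os
import pickle  # stand-in for dill in this sketch
import tempfile

data = {"John": 87.5}  # hypothetical payload
path = os.path.join(tempfile.mkdtemp(), "name_model.pkl")

with open(path, "wb") as f:
    pickle.dump(data, f)

# Passing the path string instead of a file object fails.
try:
    pickle.load(path)
    load_from_path_failed = False
except (TypeError, AttributeError) as exc:
    load_from_path_failed = True
    print("load(path) failed:", type(exc).__name__)

# Opening the file in binary read mode and passing the handle works.
with open(path, "rb") as f:
    restored = pickle.load(f)
print(restored)
```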