Original question: http://stackoverflow.com/questions/40180737/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me):
Stack Overflow
How to get data from pickle files into a pandas dataframe
Asked by Andrew Smith
I'm working on a social media sentiment analysis for a class. I have saved all of the tweets about the Kentucky Derby over a two-month period into .pkl files.
My question is: how do I get all of these pickle dump files loaded into a dataframe?
Here is my code:
import pickle as pkl
from datetime import date, timedelta

import pandas as pd
import sklearn as sk
import got3

def daterange(start_date, end_date):
    # Yield each date from start_date up to, but not including, end_date.
    for n in range(int((end_date - start_date).days)):
        yield start_date + timedelta(n)

start_date = date(2016, 3, 31)
end_date = date(2016, 6, 1)

dates = []
for single_date in daterange(start_date, end_date):
    dates.append(single_date.strftime("%Y-%m-%d"))

for i in range(len(dates) - 1):
    this_date = dates[i]
    tomorrow_date = dates[i + 1]
    print("Getting tweets for " + tomorrow_date)
    tweetCriteria = got3.manager.TweetCriteria()
    tweetCriteria.setQuerySearch("Kentucky Derby")
    tweetCriteria.setQuerySearch("KYDerby")
    tweetCriteria.setSince(this_date)
    tweetCriteria.setUntil(tomorrow_date)
    Kentucky_Derby_tweets = got3.manager.TweetManager.getTweets(tweetCriteria)
    # Save one pickle file per day, named after the "until" date.
    with open(tomorrow_date + ".pkl", "wb") as f:
        pkl.dump(Kentucky_Derby_tweets, f)
Answered by simon
You can load each file with
pd.read_pickle(filename)
- add each result to a list
- then combine them with
pd.concat(thelist)
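The steps above could be sketched roughly as follows. One caveat: pd.concat only accepts pandas objects, and the pickles in the question appear to hold lists of tweet objects rather than DataFrames, so this sketch converts each list first. It assumes the tweet objects keep their fields as plain instance attributes (so vars() can read them) and that got3 is importable when the files are unpickled. The names to_frame, load_pickles and the source_file column are made up for illustration.

```python
import glob
import os

import pandas as pd


def to_frame(obj):
    """Return obj as a DataFrame (hypothetical helper, not part of pandas)."""
    if isinstance(obj, pd.DataFrame):
        return obj
    # Assumption: anything else is a list of dicts or of simple objects
    # (such as tweets) whose fields are ordinary instance attributes.
    rows = [item if isinstance(item, dict) else vars(item) for item in obj]
    return pd.DataFrame(rows)


def load_pickles(pattern):
    """Load every pickle file matching pattern and stack the rows."""
    frames = []
    for path in sorted(glob.glob(pattern)):
        frame = to_frame(pd.read_pickle(path))
        # Record which file each row came from; in the question the
        # file name is the date the tweets were collected for.
        frames.append(frame.assign(source_file=os.path.basename(path)))
    # pd.concat raises ValueError if no files matched the pattern.
    return pd.concat(frames, ignore_index=True)
```

With the files from the question in the current directory, a call such as load_pickles("*.pkl") would return one DataFrame with a row per tweet.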