Concatenate a list of pandas dataframes together
Notice: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original source: http://stackoverflow.com/questions/32444138/
Asked by Whitebeard
I have a list of Pandas dataframes that I would like to combine into one Pandas dataframe. I am using Python 2.7.10 and Pandas 0.16.2.
I created the list of dataframes as follows:
import pandas as pd
dfs = []
sqlall = "select * from mytable"
# cnxn is an existing database connection
for chunk in pd.read_sql_query(sqlall, cnxn, chunksize=10000):
    dfs.append(chunk)
This produces a list of dataframes:
type(dfs[0])
Out[6]: pandas.core.frame.DataFrame
type(dfs)
Out[7]: list
len(dfs)
Out[8]: 408
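For anyone who wants to reproduce the chunked read without a database at hand, here is a minimal sketch. It assumes an in-memory SQLite table as a stand-in for the real one: the table contents and the chunk size of 10 are made up for illustration, and only `pd.read_sql_query` with `chunksize` comes from the snippet above.

```python
import sqlite3
import pandas as pd

# Hypothetical stand-in for the real database: an in-memory SQLite table
cnxn = sqlite3.connect(":memory:")
source = pd.DataFrame({"one": list(range(25)), "two": list(range(25, 50))})
source.to_sql("mytable", cnxn, index=False)

sqlall = "select * from mytable"
# With chunksize set, read_sql_query yields DataFrames of at most 10 rows each
dfs = [chunk for chunk in pd.read_sql_query(sqlall, cnxn, chunksize=10)]
cnxn.close()

print(len(dfs))                # 3
print([len(c) for c in dfs])   # [10, 10, 5]
```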
Here is some sample data:
# sample dataframes
d1 = pd.DataFrame({'one' : [1., 2., 3., 4.], 'two' : [4., 3., 2., 1.]})
d2 = pd.DataFrame({'one' : [5., 6., 7., 8.], 'two' : [9., 10., 11., 12.]})
d3 = pd.DataFrame({'one' : [15., 16., 17., 18.], 'two' : [19., 10., 11., 12.]})
# list of dataframes
mydfs = [d1, d2, d3]
I would like to combine d1, d2, and d3 into one pandas dataframe. Alternatively, a method of reading a large-ish table directly into a dataframe when using the chunksize option would be very helpful.
Accepted answer by DeepSpace
Given that all the dataframes have the same columns, you can simply concat them:
import pandas as pd
df = pd.concat(list_of_dataframes)
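To make this concrete with the sample data from the question, here is a small sketch. The `ignore_index=True` argument is an addition not mentioned in the answer: without it, each frame keeps its own row labels, so the labels 0 to 3 repeat in the result.

```python
import pandas as pd

d1 = pd.DataFrame({'one': [1., 2., 3., 4.], 'two': [4., 3., 2., 1.]})
d2 = pd.DataFrame({'one': [5., 6., 7., 8.], 'two': [9., 10., 11., 12.]})
d3 = pd.DataFrame({'one': [15., 16., 17., 18.], 'two': [19., 10., 11., 12.]})
mydfs = [d1, d2, d3]

# Default behaviour: original row labels are kept, so 0..3 appears three times
stacked = pd.concat(mydfs)

# ignore_index=True discards the old labels and builds a fresh 0..11 index
combined = pd.concat(mydfs, ignore_index=True)

print(combined.shape)  # (12, 2)
```

The same call should work on the list of chunks collected from `read_sql_query`, since each chunk is an ordinary DataFrame.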
Answer by meyerson
If the dataframes DO NOT all have the same columns, try the following:
df = pd.DataFrame.from_dict(map(dict,df_list))
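As an alternative sketch (not meyerson's method): `pd.concat` itself accepts frames with different columns. With its default outer join it keeps the union of the columns and fills the missing cells with NaN. The frames `a` and `b` below are made up for illustration.

```python
import pandas as pd

# Hypothetical frames that share only the 'one' column
a = pd.DataFrame({'one': [1., 2.], 'two': [3., 4.]})
b = pd.DataFrame({'one': [5., 6.], 'three': [7., 8.]})

# Outer join (the default): union of columns, NaN where a frame lacks a column
combined = pd.concat([a, b], ignore_index=True, sort=False)

# Inner join: only the columns present in every frame survive
common = pd.concat([a, b], ignore_index=True, join='inner')

print(list(combined.columns))  # ['one', 'two', 'three']
print(list(common.columns))    # ['one']
```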
Answer by Jay Wong
You can also do it with functional programming:
from functools import reduce  # reduce is a builtin on Python 2; Python 3 needs this import
reduce(lambda df1, df2: df1.merge(df2, how="outer"), mydfs)
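Note that an outer merge is not the same operation as concat. With no `on` argument, merge joins on all shared columns, so a row that appears in two frames is matched and comes out once. A small sketch with made-up frames `a` and `b`:

```python
from functools import reduce
import pandas as pd

# Hypothetical frames; the row (2.0, 4.0) appears in both
a = pd.DataFrame({'one': [1., 2.], 'two': [3., 4.]})
b = pd.DataFrame({'one': [2., 5.], 'two': [4., 6.]})

# Outer merge on the shared columns: the common row is matched, not repeated
merged = reduce(lambda df1, df2: df1.merge(df2, how="outer"), [a, b])

# concat simply stacks the rows, keeping the duplicate
stacked = pd.concat([a, b], ignore_index=True)

print(len(merged))   # 3
print(len(stacked))  # 4
```

With the sample data from the question no rows overlap, so both approaches return 12 rows, though the row order may differ.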
Answer by Lelouch
concat also works nicely with a list comprehension that uses loc to select rows from an existing dataframe:
df = pd.read_csv('./data.csv')  # DataFrame read from a CSV file with a "userID" column
review_ids = ['1', '2', '3']  # ID values to select (must match the dtype of the userID column)
# Select the rows of df whose userID matches each ID, then combine them
dfa = pd.concat([df.loc[df['userID'] == x] for x in review_ids])
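For comparison, here is a sketch of the same selection with a single boolean mask built by `Series.isin`. This is an alternative, not part of Lelouch's answer, and the in-memory frame stands in for the CSV file. The two approaches return the same rows in a different order: the concat version groups rows by ID, while the mask keeps the original row order.

```python
import pandas as pd

# Hypothetical stand-in for the CSV data, with userID stored as strings
df = pd.DataFrame({'userID': ['1', '2', '3', '1', '4'],
                   'score': [10, 20, 30, 40, 50]})
review_ids = ['1', '2', '3']

# Approach from the answer: one slice per ID, so rows come out grouped by ID
grouped = pd.concat([df.loc[df['userID'] == x] for x in review_ids])

# Single boolean mask: same rows, kept in their original order
masked = df.loc[df['userID'].isin(review_ids)]

print(list(grouped.index))  # [0, 3, 1, 2]
print(list(masked.index))   # [0, 1, 2, 3]
```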