Python: Concatenate a list of pandas dataframes together

Question

Asked by Whitebeard

I have a list of Pandas dataframes that I would like to combine into one Pandas dataframe. I am using Python 2.7.10 and Pandas 0.16.2.

I created the list of dataframes from:

import pandas as pd
dfs = []
sqlall = "select * from mytable"

for chunk in pd.read_sql_query(sqlall , cnxn, chunksize=10000):
    dfs.append(chunk)

This returns a list of dataframes:

type(dfs[0])
Out[6]: pandas.core.frame.DataFrame

type(dfs)
Out[7]: list

len(dfs)
Out[8]: 408

Here is some sample data:

# sample dataframes
d1 = pd.DataFrame({'one' : [1., 2., 3., 4.], 'two' : [4., 3., 2., 1.]})
d2 = pd.DataFrame({'one' : [5., 6., 7., 8.], 'two' : [9., 10., 11., 12.]})
d3 = pd.DataFrame({'one' : [15., 16., 17., 18.], 'two' : [19., 10., 11., 12.]})

# list of dataframes
mydfs = [d1, d2, d3]

I would like to combine d1, d2, and d3 into one pandas dataframe. Alternatively, a method of reading a large-ish table directly into a dataframe when using the chunksize option would be very helpful.

Answer 1

Accepted answer by DeepSpace

Given that all the dataframes have the same columns, you can simply concat them:

import pandas as pd
df = pd.concat(list_of_dataframes)
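
As a minimal sketch of the above using the sample frames from the question: `ignore_index=True` renumbers the rows instead of repeating 0..3 for every frame. The second half addresses the chunksize part of the question. It assumes an in-memory SQLite database as a stand-in for the asker's `cnxn` connection (the real database is not shown in the question) and passes the chunk iterator straight to `pd.concat`, so no intermediate list is built by hand.

```python
import sqlite3

import pandas as pd

# Sample frames from the question
d1 = pd.DataFrame({'one': [1., 2., 3., 4.], 'two': [4., 3., 2., 1.]})
d2 = pd.DataFrame({'one': [5., 6., 7., 8.], 'two': [9., 10., 11., 12.]})
d3 = pd.DataFrame({'one': [15., 16., 17., 18.], 'two': [19., 10., 11., 12.]})
mydfs = [d1, d2, d3]

# Stack the rows; ignore_index=True gives a fresh 0..n-1 index
combined = pd.concat(mydfs, ignore_index=True)

# Assumption: an in-memory SQLite table stands in for the real "mytable"
cnxn = sqlite3.connect(":memory:")
combined.to_sql("mytable", cnxn, index=False)

# read_sql_query with chunksize yields DataFrames one chunk at a time;
# pd.concat consumes that iterator directly
chunks = pd.read_sql_query("select * from mytable", cnxn, chunksize=5)
from_sql = pd.concat(chunks, ignore_index=True)
cnxn.close()
```

Without `ignore_index=True` the result keeps each frame's original index labels, which may or may not be what you want.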

Answer 2

Answered by meyerson

If the dataframes do NOT all have the same columns, try the following:

df = pd.DataFrame.from_dict(map(dict,df_list))
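
For comparison, `pd.concat` itself also accepts frames whose columns differ. The sketch below is an alternative, not this answer's method, and the frames `a` and `b` are made up for illustration. With the default `join="outer"` the result has the union of the columns and missing cells are filled with NaN; `join="inner"` keeps only the columns every frame shares.

```python
import pandas as pd

# Hypothetical frames that share only the 'one' column
a = pd.DataFrame({'one': [1., 2.], 'two': [3., 4.]})
b = pd.DataFrame({'one': [5., 6.], 'three': [7., 8.]})

# Union of columns; cells a frame lacks become NaN
outer = pd.concat([a, b], ignore_index=True, sort=False)

# Intersection of columns only
inner = pd.concat([a, b], ignore_index=True, join="inner")
```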

Answer 3

Answered by Jay Wong

You can also do it with functional programming:

from functools import reduce  # built in on Python 2, must be imported on Python 3

reduce(lambda df1, df2: df1.merge(df2, how="outer"), mydfs)
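
Note that an outer merge joins on shared column values rather than stacking rows, so it is not a drop-in replacement for concat. A small sketch of where the reduce pattern fits naturally, assuming three hypothetical frames that share a `key` column (these names are not from the question):

```python
from functools import reduce

import pandas as pd

# Hypothetical frames, each holding a different attribute per key
left = pd.DataFrame({'key': [1, 2], 'a': [10, 20]})
mid = pd.DataFrame({'key': [2, 3], 'b': [200, 300]})
right = pd.DataFrame({'key': [1, 3], 'c': [1000, 3000]})

# Fold the list pairwise into one frame; how="outer" keeps every key,
# and attributes missing for a key become NaN
merged = reduce(
    lambda x, y: x.merge(y, how="outer", on="key"),
    [left, mid, right],
)
```

Here the result has one row per distinct key and one column per attribute, whereas `pd.concat` on the same list would produce six rows.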

Answer 4

Answered by Lelouch

concat also works nicely with a list comprehension that uses .loc to pull rows from an existing dataframe:

df = pd.read_csv('./data.csv')  # i.e., a DataFrame read from a CSV file with a "userID" column

review_ids = ['1', '2', '3']  # i.e., the ID values to select from the DataFrame

# Get the rows of df whose userID matches each ID, then combine them

dfa = pd.concat([df.loc[df['userID'] == x] for x in review_ids])
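
A self-contained sketch of the same idea, using a small made-up frame in place of `./data.csv` (the `score` column and its values are assumptions for illustration). It also shows a single boolean mask with `isin`, which selects the same rows without building one frame per ID. The difference is row order: the concat version groups rows by ID in the order given in `review_ids`, while the mask keeps the original row order.

```python
import pandas as pd

# Hypothetical stand-in for the CSV data; userID values are strings
df = pd.DataFrame({
    'userID': ['1', '2', '3', '4', '2'],
    'score': [5, 3, 4, 1, 2],
})
review_ids = ['1', '2', '3']

# One filtered frame per ID, concatenated: rows come out grouped by ID
per_id = pd.concat([df.loc[df['userID'] == x] for x in review_ids])

# One mask over the whole frame: same rows, original order preserved
masked = df.loc[df['userID'].isin(review_ids)]
```

If `userID` is read from a CSV as integers, comparing it against string IDs matches nothing, so the ID list and the column need the same type.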
