Pandas equivalent of the rbind operation

Question

Asked by TTT

Basically, I am looping through a bunch of CSV files and, in the end, would like to append each dataframe into one. All I really need is an rbind-type function. I searched around and followed the guide, but I still could not get the ideal solution.

Sample code is attached below. For instance, the shape of data1 is always 47 by 42, but the shape of data_out_final becomes (47, 42), (47, 84), and (47, 126) after the first three files. Ideally, it should be (141, 42). In addition, I checked the index of data1, which is RangeIndex(start=0, stop=47, step=1). I'd appreciate any suggestions!

My pandas version is 0.18.1.

code

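import pandas as pd

appended_data = []
for csv_each in csv_pool:  # csv_pool: the collection of CSV file paths
    data1 = pd.read_csv(csv_each, header=0)
    # do something here (the processing that turns data1 into data2)
    appended_data.append(data2)
# axis=1 places the frames side by side, so columns grow instead of rows
data_out_final = pd.concat(appended_data, axis=1)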

If using data_out_final = pd.concat(appended_data, axis=0), the shape of data_out_final becomes (141, 94).
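A small sketch of how the axis argument can produce shapes like the ones reported. The frames, their sizes, and the column names (id, x0, y0, ...) are made up for illustration and are not the asker's data; the point is only that axis=1 keeps the row count and adds columns, while axis=0 stacks rows and takes the union of column names.

```python
import numpy as np
import pandas as pd

# Three hypothetical frames of 47 rows each; only "id" is shared by name
frames = [
    pd.DataFrame(np.zeros((47, 3)), columns=["id", f"x{i}", f"y{i}"])
    for i in range(3)
]

wide = pd.concat(frames, axis=1)  # side by side: rows align on the index
tall = pd.concat(frames, axis=0)  # stacked: columns align on their names

print(wide.shape)  # (47, 9)
print(tall.shape)  # (141, 7)
```

In the stacked result, columns that a given frame lacks are filled with NaN, which is why the column count ends up larger than that of any single frame.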

PS

I kind of figured it out. Actually, you have to standardize the column names before pd.concat.
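A minimal sketch of that fix, assuming every file holds the same columns in the same order and only the names differ. The standard_cols list and the generated frames are hypothetical stand-ins for the real CSV data.

```python
import numpy as np
import pandas as pd

standard_cols = ["id", "x", "y"]  # hypothetical shared column names

frames = []
for i in range(3):
    # Stand-in for pd.read_csv(...): same layout, different column names
    df = pd.DataFrame(np.ones((47, 3)), columns=["id", f"x{i}", f"y{i}"])
    df.columns = standard_cols  # rename by position before concatenating
    frames.append(df)

# Rows are stacked; ignore_index=True renumbers them 0..140
stacked = pd.concat(frames, axis=0, ignore_index=True)
print(stacked.shape)  # (141, 3)
```

Renaming by position is only safe when the column order really is identical across files; if it is not, an explicit per-file mapping passed to DataFrame.rename would be the more careful choice.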

Answer 1

Answered by Asish M.

>>> df1
          a         b
0 -1.417866 -0.828749
1  0.212349  0.791048
2 -0.451170  0.628584
3  0.612671 -0.995330
4  0.078460 -0.322976
5  1.244803  1.576373
6  1.169629 -1.135926
7 -0.652443  0.506388
8  0.549604 -0.691054
9 -0.512829 -0.959398

>>> df2
          a         b
0 -0.652161  0.940932
1  2.495067  0.004833
2 -2.187792  1.692402
3  1.900738  0.372425
4  0.245976  1.894527
5  0.627297  0.029331
6 -0.828628 -1.600014
7 -0.991835 -0.061202
8  0.543389  0.703457
9 -0.755059  1.239968

>>> pd.concat([df1, df2])
          a         b
0 -1.417866 -0.828749
1  0.212349  0.791048
2 -0.451170  0.628584
3  0.612671 -0.995330
4  0.078460 -0.322976
5  1.244803  1.576373
6  1.169629 -1.135926
7 -0.652443  0.506388
8  0.549604 -0.691054
9 -0.512829 -0.959398
0 -0.652161  0.940932
1  2.495067  0.004833
2 -2.187792  1.692402
3  1.900738  0.372425
4  0.245976  1.894527
5  0.627297  0.029331
6 -0.828628 -1.600014
7 -0.991835 -0.061202
8  0.543389  0.703457
9 -0.755059  1.239968

Unless I'm misinterpreting what you need, this is what you need.
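Note that the output above keeps each frame's original index, so the labels 0 to 9 appear twice. If a fresh running index is wanted, pd.concat accepts ignore_index=True. A short sketch with two tiny made-up frames:

```python
import pandas as pd

df1 = pd.DataFrame({"a": [1.0, 2.0], "b": [3.0, 4.0]})
df2 = pd.DataFrame({"a": [5.0, 6.0], "b": [7.0, 8.0]})

kept = pd.concat([df1, df2])                      # original labels are kept
fresh = pd.concat([df1, df2], ignore_index=True)  # labels are renumbered

print(list(kept.index))   # [0, 1, 0, 1]
print(list(fresh.index))  # [0, 1, 2, 3]
```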

Answer 2

Answered by Jon

Try: http://pandas.pydata.org/pandas-docs/stable/10min.html?highlight=concat#concat

"pandas provides various facilities for easily combining together Series, DataFrame, and Panel objects with various kinds of set logic for the indexes and relational algebra functionality in the case of join / merge-type operations."
