Pandas: merge two datasets with the same number of rows

Question

Asked by stanedav

I have two tables with the same number of rows (the second table is computed from the first by processing the text in T1). Both are stored as pandas DataFrames. T2 has no column in common with T1. Here is a small example, because my real tables are huge:

T1:
| name  | street  | city   |
|-------|---------|--------|
| David | street1 | Prague |
| John  | street2 | Berlin |
| Joe   | street3 | London |

T2:
| computed1 | computed2 |
|-----------|-----------|
| 0.5       | 0.3       |
| 0.2       | 0.8       |
| 0.1       | 0.6       |

Merged:
| name  | street  | city   | computed1 | computed2 |
|-------|---------|--------|-----------|-----------|
| David | street1 | Prague | 0.5       | 0.3       |
| John  | street2 | Berlin | 0.2       | 0.8       |
| Joe   | street3 | London | 0.1       | 0.6       |

I tried these commands:

pd.concat([T1,T2])
pd.merge([T1,T2])
result=T1.join(T2)

With concat and merge, only the first thousand rows are combined and the rest is filled with NaN (I double-checked that both tables are the same size), and .join does not combine them because they have nothing in common.

Is there a way to combine these two tables in pandas?

Thanks

Answer 1

Answered by jezrael

You need reset_index() before concat so that both frames have default indices:

df = pd.concat([T1.reset_index(drop=True), T2.reset_index(drop=True)], axis=1)
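
A minimal sketch of why resetting the index helps, using the small tables from the question. The non-default index on T2 is an assumption made only to reproduce the NaN rows; the actual cause in the asker's data may be different.

```python
import pandas as pd

T1 = pd.DataFrame({
    "name": ["David", "John", "Joe"],
    "street": ["street1", "street2", "street3"],
    "city": ["Prague", "Berlin", "London"],
})

# Assumption for illustration: T2 carries index labels that differ from T1's,
# for example left over from an earlier filtering or processing step.
T2 = pd.DataFrame(
    {"computed1": [0.5, 0.2, 0.1], "computed2": [0.3, 0.8, 0.6]},
    index=[10, 11, 12],
)

# concat with axis=1 aligns rows by index label, not by position,
# so labels that do not match end up as separate rows padded with NaN.
misaligned = pd.concat([T1, T2], axis=1)
print(misaligned.shape)  # (6, 5)

# Dropping the old labels gives both frames a fresh 0..n-1 index,
# so the rows line up by position.
aligned = pd.concat(
    [T1.reset_index(drop=True), T2.reset_index(drop=True)], axis=1
)
print(aligned)
```

If the first thousand rows line up and the rest do not, it is likely that only part of the two indices overlaps, which is the same effect on a larger scale.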

Answer 2

Answered by MKJ

I want to add that pd.concat can do what you want if you simply set the axis to columns, like this:

pd.concat([T1,T2],axis=1)
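
Note that this relies on both frames sharing the same index labels, since concat with axis=1 aligns rows by label rather than by position. A small sketch of checking that first; the guard and the shortened example tables are illustrative additions, not part of the original answer.

```python
import pandas as pd

T1 = pd.DataFrame({
    "name": ["David", "John", "Joe"],
    "city": ["Prague", "Berlin", "London"],
})
T2 = pd.DataFrame({
    "computed1": [0.5, 0.2, 0.1],
    "computed2": [0.3, 0.8, 0.6],
})

# Both frames were built with the default RangeIndex, so their labels match.
same_index = T1.index.equals(T2.index)

if same_index:
    # Labels already agree, so a plain column-wise concat is enough.
    merged = pd.concat([T1, T2], axis=1)
else:
    # Fall back to positional alignment by discarding the old labels.
    merged = pd.concat(
        [T1.reset_index(drop=True), T2.reset_index(drop=True)], axis=1
    )

print(merged)
```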

Answer 3

Answered by dubbbdan

Another way would be to merge on the index values:

df = T1.reset_index().merge(T2.reset_index(), left_index=True, right_index=True, how='left')
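
One thing to be aware of: reset_index() without drop=True keeps the old index as a column named "index" in each frame, so the merged result gains index_x and index_y columns. A sketch of the difference, assuming (for illustration only) that T2 has a non-default index:

```python
import pandas as pd

T1 = pd.DataFrame({
    "name": ["David", "John", "Joe"],
    "city": ["Prague", "Berlin", "London"],
})
# Assumed non-default index, purely to make the effect visible.
T2 = pd.DataFrame(
    {"computed1": [0.5, 0.2, 0.1], "computed2": [0.3, 0.8, 0.6]},
    index=[10, 11, 12],
)

# Old index values survive as "index" columns and get the _x/_y suffixes.
with_index_cols = T1.reset_index().merge(
    T2.reset_index(), left_index=True, right_index=True, how="left"
)
print(list(with_index_cols.columns))

# drop=True discards the old labels, leaving only the data columns.
left = T1.reset_index(drop=True)
right = T2.reset_index(drop=True)
merged = left.merge(right, left_index=True, right_index=True, how="left")

# DataFrame.join performs a left join on the index by default,
# so it is a shorter way to express the same operation.
joined = left.join(right)
print(merged)
```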
