Pandas merge two datasets with same number of rows
Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original source: http://stackoverflow.com/questions/47655296/
Asked by stanedav
I have two tables with the same number of rows (the second table is computed from the first one by processing the text inside T1). Both are stored as pandas DataFrames. T2 has no column in common with T1. Here is a small example, because my real tables are huge:
T1:
| name | street | city |
|-------|---------|--------|
| David | street1 | Prague |
| John | street2 | Berlin |
| Joe | street3 | London |
T2:
| computed1 | computed2 |
|-----------|-----------|
| 0.5 | 0.3 |
| 0.2 | 0.8 |
| 0.1 | 0.6 |
Merged:
| name | street | city | computed1 | computed2 |
|-------|---------|--------|-----------|-----------|
| David | street1 | Prague | 0.5 | 0.3 |
| John | street2 | Berlin | 0.2 | 0.8 |
| Joe | street3 | London | 0.1 | 0.6 |
I tried these commands:
pd.concat([T1,T2])
pd.merge([T1,T2])
result=T1.join(T1)
With concat and merge I get only the first thousand rows combined and the rest is filled with NaN (I double-checked that both tables are the same size), and .join does not combine them because they have nothing in common.
Is there any way to combine these two tables in pandas?
Thanks
Answered by jezrael
You need reset_index() before concat so that both frames have default indices:
df = pd.concat([T1.reset_index(drop=True), T2.reset_index(drop=True)], axis=1)
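A minimal sketch of why resetting the index helps. The sample data mirrors the tables in the question; the shifted index on T2 (10, 11, 12) is an assumption made for illustration, standing in for whatever caused the indices of the real tables to differ.

```python
import pandas as pd

T1 = pd.DataFrame({
    "name": ["David", "John", "Joe"],
    "street": ["street1", "street2", "street3"],
    "city": ["Prague", "Berlin", "London"],
})
# Assumption: T2 carries different index labels than T1.
T2 = pd.DataFrame(
    {"computed1": [0.5, 0.2, 0.1], "computed2": [0.3, 0.8, 0.6]},
    index=[10, 11, 12],
)

# concat aligns on index labels, so mismatched labels give NaN-padded rows.
misaligned = pd.concat([T1, T2], axis=1)

# After resetting both indices, the rows line up by position.
merged = pd.concat([T1.reset_index(drop=True), T2.reset_index(drop=True)], axis=1)
print(merged)
```

Here misaligned has six rows, each half-filled with NaN, while merged has the three combined rows the question asks for.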
Answered by MKJ
I want to add that pd.concat can do what you want if you simply set the axis to columns, like this:
pd.concat([T1,T2],axis=1)
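As a rough illustration of the difference the axis argument makes, the sketch below assumes both frames already share the same default index; the reduced column set is just sample data for brevity. If the index labels differ, axis=1 alone will still pad with NaN.

```python
import pandas as pd

T1 = pd.DataFrame({"name": ["David", "John", "Joe"],
                   "city": ["Prague", "Berlin", "London"]})
T2 = pd.DataFrame({"computed1": [0.5, 0.2, 0.1],
                   "computed2": [0.3, 0.8, 0.6]})

# Both frames use the default RangeIndex 0..2, so the labels already match.
same_index = T1.index.equals(T2.index)

# axis=0 (the default) stacks the rows of T2 under T1.
stacked = pd.concat([T1, T2])

# axis=1 places the frames side by side, aligned on index labels.
side_by_side = pd.concat([T1, T2], axis=1)
```

Checking T1.index.equals(T2.index) first is a cheap way to tell whether plain axis=1 is enough or whether the indices need to be reset.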
Answered by dubbbdan
Another way would be to merge on the index values:
df = T1.reset_index().merge(T2.reset_index(), left_index=True, right_index=True, how='left')
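A small sketch of the index-based merge, with one variation that is not part of the answer: passing drop=True so the old index is discarded rather than kept as an extra column. The non-default index on T1 is an assumption made for illustration.

```python
import pandas as pd

T1 = pd.DataFrame(
    {"name": ["David", "John", "Joe"], "city": ["Prague", "Berlin", "London"]},
    index=[5, 6, 7],  # assumption: T1 has a non-default index
)
T2 = pd.DataFrame({"computed1": [0.5, 0.2, 0.1],
                   "computed2": [0.3, 0.8, 0.6]})

# Without drop=True each old index becomes an "index" column, and the
# merge renames the two clashing columns to index_x and index_y.
with_old_index = T1.reset_index().merge(
    T2.reset_index(), left_index=True, right_index=True, how="left"
)

# Variation: drop the old indices so only the data columns remain.
merged = T1.reset_index(drop=True).merge(
    T2.reset_index(drop=True), left_index=True, right_index=True, how="left"
)
```

Both calls line the rows up by position; the drop=True form just avoids the leftover index_x and index_y columns.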