Notice: this page is a translated copy of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/45889486/

Pandas merge on index column?

python pandas

Asked by DmitrySemenov

In [88]: c
Out[88]: 
                       Address    Name
CustomerID                            
10            Address for Mike    Mike
11          Address for Marcia  Marcia

In [89]: c.index
Out[89]: Int64Index([10, 11], dtype='int64', name='CustomerID')

In [90]: orders
Out[90]: 
   CustomerID   OrderDate
0          10  2014-12-01
1          11  2014-12-01
2          10  2014-12-01

In [91]: orders.index
Out[91]: RangeIndex(start=0, stop=3, step=1)

In [92]: c.merge(orders)
---------------------------
MergeError: No common columns to perform merge on

So pandas can't merge if the index column in one dataframe has the same name as a regular column in a second dataframe?

Answered by rojeeer

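You need to specify explicitly how to join the tables. By default, merge uses the column names common to both dataframes as the merge key, and here there are none because CustomerID is the index of c rather than a column. For your case: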

c.merge(orders, left_index=True, right_on='CustomerID')
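
As a minimal, self-contained sketch of that call: the two frames below are rebuilt from the output shown in the question, and the variable name merged is only illustrative.

```python
import pandas as pd

# Customers, indexed by CustomerID (as in the question).
c = pd.DataFrame(
    {'Address': ['Address for Mike', 'Address for Marcia'],
     'Name': ['Mike', 'Marcia']},
    index=pd.Index([10, 11], name='CustomerID'))

# Orders, with CustomerID as an ordinary column.
orders = pd.DataFrame(
    {'CustomerID': [10, 11, 10],
     'OrderDate': ['2014-12-01', '2014-12-01', '2014-12-01']})

# Match the index of c against the CustomerID column of orders.
merged = c.merge(orders, left_index=True, right_on='CustomerID')
print(merged)
```

Each order row is paired with the matching customer, so Mike (two orders) appears twice.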

Also, please read the docs for pandas.DataFrame.merge. Hope this helps.

Answered by Alexander

The join method does a left join by default (how='left') and joins on the indices of the dataframes. So set the index of the orders dataframe to CustomerID and then join.

import pandas as pd

# Create sample data.
orders = pd.DataFrame(
    {'CustomerID': [10, 11, 10],
     'OrderDate': ['2014-12-01', '2014-12-01', '2014-12-01']})    
c = pd.DataFrame(
    {'Address': ['Address for Mike', 'Address for Marcia'], 
     'Name': ['Mike', 'Marcia']},
    index=pd.Index([10, 11], dtype='int64', name='CustomerID'))

# Join.
>>> c.join(orders.set_index('CustomerID'))
                       Address    Name   OrderDate
CustomerID                                        
10            Address for Mike    Mike  2014-12-01
10            Address for Mike    Mike  2014-12-01
11          Address for Marcia  Marcia  2014-12-01

Alternatively, this merge gives you the same rows (with CustomerID kept as a regular column). Here, you are joining on the index of c (the left dataframe) and on the CustomerID column in the right dataframe. Specify how='left' to keep every record from c, including customers that have no match in orders; a customer with several orders still appears once per order. The default behavior for merge is an inner join, whereby the result only includes those records from c that find a match in orders (although this could be your desired result).

c.merge(orders, left_index=True, right_on='CustomerID', how='left')
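
With the question's data every customer has an order, so the inner and left joins return the same rows. The sketch below shows where they differ. It assumes a hypothetical third customer (ID 12, "Nora") who has no orders; that row is not part of the original question, and the names inner and left are illustrative.

```python
import pandas as pd

# Hypothetical data: customer 12 ("Nora") has no orders.
c = pd.DataFrame(
    {'Name': ['Mike', 'Marcia', 'Nora']},
    index=pd.Index([10, 11, 12], name='CustomerID'))
orders = pd.DataFrame(
    {'CustomerID': [10, 11, 10],
     'OrderDate': ['2014-12-01', '2014-12-01', '2014-12-01']})

# Default (inner) join: only customers with at least one order.
inner = c.merge(orders, left_index=True, right_on='CustomerID')

# Left join: every customer is kept; missing order data becomes NaN.
left = c.merge(orders, left_index=True, right_on='CustomerID', how='left')

print(len(inner))  # 3 rows, customer 12 is dropped
print(len(left))   # 4 rows, customer 12 is kept with OrderDate NaN
```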

Answered by Kyle

Try resetting the index:

c.reset_index().merge(orders)
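
A short sketch of why this works, using frames rebuilt from the question's output (the name merged is illustrative): reset_index() turns the CustomerID index into an ordinary column, so merge finds it as the one column name shared by both dataframes.

```python
import pandas as pd

c = pd.DataFrame(
    {'Address': ['Address for Mike', 'Address for Marcia'],
     'Name': ['Mike', 'Marcia']},
    index=pd.Index([10, 11], name='CustomerID'))
orders = pd.DataFrame(
    {'CustomerID': [10, 11, 10],
     'OrderDate': ['2014-12-01', '2014-12-01', '2014-12-01']})

# After reset_index(), CustomerID is a column in both frames,
# so merge() can pick it up as the common key (inner join by default).
merged = c.reset_index().merge(orders)
print(merged)
```

Note that the result has a fresh default index; CustomerID stays a column unless you call set_index('CustomerID') afterwards.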