Pandas merge on index column?
Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use/share it, but you must keep it under the same CC BY-SA license, cite the original address, and attribute it to the original authors (not me): StackOverFlow
Original: http://stackoverflow.com/questions/45889486/
Asked by DmitrySemenov
In [88]: c
Out[88]:
                       Address    Name
CustomerID
10            Address for Mike    Mike
11          Address for Marcia  Marcia
In [89]: c.index
Out[89]: Int64Index([10, 11], dtype='int64', name='CustomerID')
In [90]: orders
Out[90]:
   CustomerID   OrderDate
0          10  2014-12-01
1          11  2014-12-01
2          10  2014-12-01
In [91]: orders.index
Out[91]: RangeIndex(start=0, stop=3, step=1)
In [92]: c.merge(orders)
---------------------------
MergeError: No common columns to perform merge on
So pandas can't merge if the index column in one dataframe has the same name as a column in a second dataframe?
Answered by rojeeer
You need to explicitly specify how to join the tables. By default, merge uses the column names common to both dataframes as the merge key; the index of c is not a column, so nothing matches. In your case:
c.merge(orders, left_index=True, right_on='CustomerID')
Also, please read the docs for pandas.DataFrame.merge.
Hope this helps.
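To make the default behaviour concrete, here is a minimal sketch that rebuilds the two frames from the question and checks which column labels they share. The variable names common_columns and merged are illustrative only, and the column intersection is just a way to inspect what merge would pick as its key by default.

```python
import pandas as pd

# Rebuild the frames from the question.
c = pd.DataFrame(
    {'Address': ['Address for Mike', 'Address for Marcia'],
     'Name': ['Mike', 'Marcia']},
    index=pd.Index([10, 11], name='CustomerID'))
orders = pd.DataFrame(
    {'CustomerID': [10, 11, 10],
     'OrderDate': ['2014-12-01', '2014-12-01', '2014-12-01']})

# Column labels shared by both frames. The index name of c is not a column,
# so this intersection is empty and a plain c.merge(orders) has no key.
common_columns = c.columns.intersection(orders.columns)

# Match the index of c against the CustomerID column of orders instead.
merged = c.merge(orders, left_index=True, right_on='CustomerID')
print(merged)
```

Because the key is named explicitly on both sides, the merge no longer depends on the two frames sharing a column label.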
Answered by Alexander
The join method does a left join by default (how='left') and joins on the indices of the dataframes. So set the index of the orders dataframe to CustomerID and then join.
import pandas as pd

# Create sample data.
orders = pd.DataFrame(
    {'CustomerID': [10, 11, 10],
     'OrderDate': ['2014-12-01', '2014-12-01', '2014-12-01']})
c = pd.DataFrame(
    {'Address': ['Address for Mike', 'Address for Marcia'],
     'Name': ['Mike', 'Marcia']},
    index=pd.Index([10, 11], dtype='int64', name='CustomerID'))
# Join on the index of both frames (left join by default).
>>> c.join(orders.set_index('CustomerID'))
                       Address    Name   OrderDate
CustomerID
10            Address for Mike    Mike  2014-12-01
10            Address for Mike    Mike  2014-12-01
11          Address for Marcia  Marcia  2014-12-01
Alternatively, the merge below returns the same rows (with CustomerID kept as a regular column). Here, you are joining on the index of c (the left dataframe) and on the CustomerID column in the right dataframe. Specify how='left' to keep every record from c and attach matching rows from orders where they exist: a customer with several orders appears once per order, and a customer with no orders is kept with missing values. The default behavior for merge is an inner join, whereby the result only includes those records from c that find a match in orders (although this could be your desired result).
c.merge(orders, left_index=True, right_on='CustomerID', how='left')
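The difference between the inner and left variants only shows up when some customer has no orders, which is not the case in the question's data. The sketch below therefore adds a made-up third customer ('Nina', ID 12), an assumption for illustration only and not part of the original data.

```python
import pandas as pd

# 'Nina' (CustomerID 12) is a hypothetical customer with no orders,
# added only to show how the two join types differ.
c = pd.DataFrame(
    {'Name': ['Mike', 'Marcia', 'Nina']},
    index=pd.Index([10, 11, 12], name='CustomerID'))
orders = pd.DataFrame(
    {'CustomerID': [10, 11, 10],
     'OrderDate': ['2014-12-01', '2014-12-01', '2014-12-01']})

# Default how='inner': customers without a matching order are dropped.
inner = c.merge(orders, left_index=True, right_on='CustomerID')

# how='left': every customer is kept; unmatched ones get a missing OrderDate.
left = c.merge(orders, left_index=True, right_on='CustomerID', how='left')

print(inner)
print(left)
```

With this data the inner merge has one row per order, while the left merge has the same rows plus one extra row for the customer without orders.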
Answered by Kyle
Try resetting the index:
c.reset_index().merge(orders)
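As a rough illustration of why this works: reset_index moves the CustomerID index into an ordinary column, so the two frames then share a column label and the default merge can find its key. The names flat and merged are illustrative only.

```python
import pandas as pd

c = pd.DataFrame(
    {'Address': ['Address for Mike', 'Address for Marcia'],
     'Name': ['Mike', 'Marcia']},
    index=pd.Index([10, 11], name='CustomerID'))
orders = pd.DataFrame(
    {'CustomerID': [10, 11, 10],
     'OrderDate': ['2014-12-01', '2014-12-01', '2014-12-01']})

# Turn the CustomerID index into a regular column.
flat = c.reset_index()

# Both frames now have a 'CustomerID' column, so merge uses it by default
# (inner join, since how is not specified).
merged = flat.merge(orders)
print(merged)
```

Note that CustomerID ends up as a regular column of the result rather than as its index; call set_index('CustomerID') afterwards if the index is needed.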