Original source: http://stackoverflow.com/questions/32277473/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Merge two DataFrames based on multiple keys in pandas
Asked by Surah Li
Does pandas (or another module) have any functions that support merging (or joining) two tables based on multiple keys?
For example, I have two tables (DataFrames), a and b:
>>> a
A  B  value1
1  1      23
1  2      34
2  1    2342
2  2     333
>>> b
A  B  value2
1  1    0.10
1  2    0.20
2  1    0.13
2  2    0.33
The desired result is:
A  B  value1  value2
1  1      23    0.10
1  2      34    0.20
2  1    2342    0.13
2  2     333    0.33
Answered by Alex Riley
To merge on multiple keys, you just need to pass the keys in a list to pd.merge:
>>> pd.merge(a, b, on=['A', 'B'])
   A  B  value1  value2
0  1  1      23    0.10
1  1  2      34    0.20
2  2  1    2342    0.13
3  2  2     333    0.33
In fact, the default for pd.merge is to use the intersection of the two DataFrames' column labels, so pd.merge(a, b) would work equally well in this case.
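As a minimal sketch of the point above, the following rebuilds the question's tables a and b and checks that the explicit and default forms agree. The constructor calls are an assumption, since the question only shows the printed tables.

```python
import pandas as pd

# Rebuild the question's tables (assumed construction; only the printed
# form appears in the question).
a = pd.DataFrame({"A": [1, 1, 2, 2], "B": [1, 2, 1, 2],
                  "value1": [23, 34, 2342, 333]})
b = pd.DataFrame({"A": [1, 1, 2, 2], "B": [1, 2, 1, 2],
                  "value2": [0.10, 0.20, 0.13, 0.33]})

# Explicit multi-key merge: the keys are passed as a list.
explicit = pd.merge(a, b, on=["A", "B"])

# Default merge: with no `on`, pandas joins on the columns the two frames
# share, which here are exactly A and B.
implicit = pd.merge(a, b)

print(explicit.equals(implicit))  # True
```

The two calls agree here only because A and B are the only shared columns; naming the keys explicitly keeps the join stable if either table later gains another common column.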
Answered by Miguel Rueda
According to the most recent pandas documentation, the on parameter accepts a label or a list of field names, all of which must be found in both DataFrames. Here is a MWE showing its use:
import pandas as pd
a = pd.DataFrame({'A': ['0', '0', '1', '1'], 'B': ['0', '1', '0', '1'], 'v': [True, False, False, True]})
b = pd.DataFrame({'A': ['0', '0', '1', '1'], 'B': ['0', '1', '0', '1'], 'v': [False, True, True, True]})
# The shared non-key column v gets one suffix per side.
result = pd.merge(a, b, on=['A', 'B'], how='inner', suffixes=['_and', '_or'])
>>> result
   A  B  v_and   v_or
0  0  0   True  False
1  0  1  False   True
2  1  0  False   True
3  1  1   True   True
on : label or list
    Column or index level names to join on. These must be found in both DataFrames. If on is None and not merging on indexes then this defaults to the intersection of the columns in both DataFrames.
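The default described in that documentation excerpt matters for this MWE, because both frames also share the column v. The sketch below reuses the same two frames and compares the explicit key list with the default; the variable names by_keys and by_default are only illustrative.

```python
import pandas as pd

a = pd.DataFrame({"A": ["0", "0", "1", "1"], "B": ["0", "1", "0", "1"],
                  "v": [True, False, False, True]})
b = pd.DataFrame({"A": ["0", "0", "1", "1"], "B": ["0", "1", "0", "1"],
                  "v": [False, True, True, True]})

# Explicit keys: v is not a key, so both copies are kept with suffixes.
by_keys = pd.merge(a, b, on=["A", "B"], suffixes=("_and", "_or"))
print(len(by_keys))     # 4

# Default keys: the shared columns are A, B and v, so v also has to match.
# Only the row A='1', B='1', v=True is identical in both frames.
by_default = pd.merge(a, b)
print(len(by_default))  # 1
```

So when the tables share columns that are not meant to be keys, leaving out on changes the result rather than just saving typing.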
Check out the latest pd.merge documentation for further details.