Note: this page is a Chinese-English side-by-side translation of a popular StackOverFlow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and authors, and attribute it to the original authors (not me): StackOverFlow
Original URL: http://stackoverflow.com/questions/54006298/
Select rows of a dataframe based on another dataframe in Python
Asked by ahbon
I have the following dataframe:
import pandas as pd
import numpy as np
df1 = pd.DataFrame({'A': 'foo bar foo bar foo bar foo foo'.split(),
                    'B': 'one one two three two two one three'.split(),
                    'C': np.arange(8), 'D': np.arange(8) * 2})
print(df1)
     A      B  C   D
0  foo    one  0   0
1  bar    one  1   2
2  foo    two  2   4
3  bar  three  3   6
4  foo    two  4   8
5  bar    two  5  10
6  foo    one  6  12
7  foo  three  7  14
I want to select the rows of df1 whose (A, B) values match a row of df2, which is defined as follows:
df2 = pd.DataFrame({'A': 'foo bar'.split(),
                    'B': 'one two'.split()})
print(df2)
     A    B
0  foo  one
1  bar  two
Here is what I have tried in Python, but I just wonder if there is another method. Thanks.
df = df1.merge(df2, on=['A','B'])
print(df)
Here is the expected output:
     A    B  C   D
0  foo  one  0   0
1  bar  two  5  10
2  foo  one  6  12
Answered by jezrael
The simplest approach is to use merge with an inner join, which is the default for merge.
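As a rough sketch of the merge approach: merge returns a fresh 0..n-1 index, so the variant below carries df1's original row labels through the join with reset_index and set_index. That step, the explicit how='inner', and the drop_duplicates call (a guard against repeated key pairs in df2 multiplying rows) are additions for illustration and are not part of the original answer.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'A': 'foo bar foo bar foo bar foo foo'.split(),
                    'B': 'one one two three two two one three'.split(),
                    'C': np.arange(8), 'D': np.arange(8) * 2})
df2 = pd.DataFrame({'A': 'foo bar'.split(),
                    'B': 'one two'.split()})

# reset_index() moves df1's unnamed index into a column called 'index',
# so the original row labels survive the merge and can be restored afterwards.
merged = (df1.reset_index()
             .merge(df2.drop_duplicates(), on=['A', 'B'], how='inner')
             .set_index('index')
             .rename_axis(None))
print(merged)
```

If the original row labels do not matter, the plain df1.merge(df2, on=['A', 'B']) from the question is enough.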
Another solution builds one boolean mask per row of df2 and keeps the rows of df1 that match any of them:
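# 'records' is spelled out because the short form 'r' is no longer accepted by recent pandas versions
arr = [np.array([df1[k] == v for k, v in x.items()]).all(axis=0) for x in df2.to_dict('records')]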
df = df1[np.array(arr).any(axis=0)]
print(df)
     A    B  C   D
0  foo  one  0   0
5  bar  two  5  10
6  foo  one  6  12
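The same filtering idea can be written more explicitly. The sketch below is a hypothetical variant, not the answerer's code: it hard-codes the key columns A and B, builds one mask per (A, B) pair in df2, and ORs the masks together with functools.reduce. It assumes df2 has at least one row, since reduce raises on an empty list.

```python
from functools import reduce
import operator

import numpy as np
import pandas as pd

df1 = pd.DataFrame({'A': 'foo bar foo bar foo bar foo foo'.split(),
                    'B': 'one one two three two two one three'.split(),
                    'C': np.arange(8), 'D': np.arange(8) * 2})
df2 = pd.DataFrame({'A': 'foo bar'.split(),
                    'B': 'one two'.split()})

# One boolean Series per row of df2: True where both A and B match that row.
masks = [(df1['A'] == a) & (df1['B'] == b)
         for a, b in zip(df2['A'], df2['B'])]

# A row of df1 is kept if it matches any row of df2.
mask = reduce(operator.or_, masks)
filtered = df1[mask]
print(filtered)
```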
Or create a MultiIndex from the key columns and filter with Index.isin:
df = df1[df1.set_index(['A','B']).index.isin(df2.set_index(['A','B']).index)]
print(df)
     A    B  C   D
0  foo  one  0   0
5  bar  two  5  10
6  foo  one  6  12
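A possible variation on the MultiIndex approach, offered as a sketch rather than as the answerer's method: MultiIndex.from_frame builds the keys directly from the selected columns without set_index, and keeping the mask in a variable also gives the non-matching rows by negating it. The variable names here are illustrative.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'A': 'foo bar foo bar foo bar foo foo'.split(),
                    'B': 'one one two three two two one three'.split(),
                    'C': np.arange(8), 'D': np.arange(8) * 2})
df2 = pd.DataFrame({'A': 'foo bar'.split(),
                    'B': 'one two'.split()})

# Build (A, B) key tuples for both frames.
keys1 = pd.MultiIndex.from_frame(df1[['A', 'B']])
keys2 = pd.MultiIndex.from_frame(df2[['A', 'B']])

# Boolean array: True where df1's (A, B) pair also appears in df2.
mask = keys1.isin(keys2)

matched = df1[mask]      # rows of df1 present in df2
unmatched = df1[~mask]   # rows of df1 absent from df2
print(matched)
```

Unlike merge, this keeps df1's original index and does not multiply rows when df2 contains the same key pair more than once.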