How to use pandas isin for multiple columns

Note: This page reproduces a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/45198786/

Date: 2020-09-14 04:02:42  Source: igfitidea

python pandas

Asked by Jun Jang

I want to find the values of col1 and col2 where the col1 and col2 of the first dataframe are both in the second dataframe.

These rows should be in the result dataframe:

  1. pizza, boy

  2. pizza, girl

  3. ice cream, boy

because all three rows are in the first and second dataframes.

How can I accomplish this? I was thinking of using isin, but I am not sure how to use it when I have to consider more than one column.

Answered by unutbu

Perform an inner merge on col1 and col2:

import pandas as pd

# Sample data: df1 is indexed 1-5, df2 is indexed 10-16
df1 = pd.DataFrame({'col1': ['pizza', 'hamburger', 'hamburger', 'pizza', 'ice cream'], 'col2': ['boy', 'boy', 'girl', 'girl', 'boy']}, index=range(1,6))
df2 = pd.DataFrame({'col1': ['pizza', 'pizza', 'chicken', 'cake', 'cake', 'chicken', 'ice cream'], 'col2': ['boy', 'girl', 'girl', 'boy', 'girl', 'boy', 'boy']}, index=range(10,17))

# Inner merge on the shared columns; reset_index/set_index carries df2's index through the merge
print(pd.merge(df2.reset_index(), df1, how='inner', on=['col1', 'col2']).set_index('index'))

yields

            col1  col2
index                 
10         pizza   boy
11         pizza  girl
16     ice cream   boy

The purpose of the reset_index and set_index calls is to preserve df2's index, as in the desired result you posted. If the index is not important, then

pd.merge(df2, df1, how='inner')
#         col1  col2
# 0      pizza   boy
# 1      pizza  girl
# 2  ice cream   boy

would suffice.
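
One caveat worth illustrating: an inner merge returns one row per matching pair, so if df1 contains the same (col1, col2) pair more than once, the matching df2 row is repeated. The sketch below uses small made-up frames (not the ones from the question) and assumes that deduplicating df1's key columns first is acceptable for your use case:

```python
import pandas as pd

# Hypothetical data: df1 contains the (pizza, boy) pair twice
df1 = pd.DataFrame({'col1': ['pizza', 'pizza', 'ice cream'],
                    'col2': ['boy', 'boy', 'boy']})
df2 = pd.DataFrame({'col1': ['pizza', 'pizza', 'chicken', 'ice cream'],
                    'col2': ['boy', 'girl', 'girl', 'boy']},
                   index=range(10, 14))

keys = ['col1', 'col2']

# Plain inner merge: the (pizza, boy) row of df2 matches two rows of df1
naive = pd.merge(df2, df1, how='inner', on=keys)

# Deduplicate the key pairs of df1 first so each df2 row appears at most once
deduped = pd.merge(df2, df1[keys].drop_duplicates(), how='inner', on=keys)

print(len(naive))    # 3
print(len(deduped))  # 2
```

If df1 is known to have unique key pairs, as in the question's data, the extra drop_duplicates step is unnecessary.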

Alternatively, you could construct MultiIndexes out of the col1 and col2 columns, and then call the MultiIndex.isin method:

index1 = pd.MultiIndex.from_arrays([df1[col] for col in ['col1', 'col2']])
index2 = pd.MultiIndex.from_arrays([df2[col] for col in ['col1', 'col2']])
print(df2.loc[index2.isin(index1)])

yields

         col1  col2
10      pizza   boy
11      pizza  girl
16  ice cream   boy
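
Because MultiIndex.isin returns a boolean array, the same mask can be inverted with ~ to select the rows of df2 whose (col1, col2) pair does not appear in df1. The following is a minimal sketch using the question's data; building the indexes with pd.MultiIndex.from_frame is a substitution made here for brevity and assumes a pandas version that provides it (0.24 or later):

```python
import pandas as pd

df1 = pd.DataFrame({'col1': ['pizza', 'hamburger', 'hamburger', 'pizza', 'ice cream'],
                    'col2': ['boy', 'boy', 'girl', 'girl', 'boy']},
                   index=range(1, 6))
df2 = pd.DataFrame({'col1': ['pizza', 'pizza', 'chicken', 'cake', 'cake', 'chicken', 'ice cream'],
                    'col2': ['boy', 'girl', 'girl', 'boy', 'girl', 'boy', 'boy']},
                   index=range(10, 17))

keys = ['col1', 'col2']

# Build one MultiIndex per frame from the key columns
index1 = pd.MultiIndex.from_frame(df1[keys])
index2 = pd.MultiIndex.from_frame(df2[keys])

# Boolean array: True where df2's (col1, col2) pair also occurs in df1
mask = index2.isin(index1)

matched = df2.loc[mask]     # rows of df2 present in df1
unmatched = df2.loc[~mask]  # rows of df2 absent from df1

print(matched.index.tolist())    # [10, 11, 16]
print(unmatched.index.tolist())  # [12, 13, 14, 15]
```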

Answered by Ningrong Ye

Thank you unutbu! Here is a little update.

import pandas as pd
df1 = pd.DataFrame({'col1': ['pizza', 'hamburger', 'hamburger', 'pizza', 'ice cream'], 'col2': ['boy', 'boy', 'girl', 'girl', 'boy']}, index=range(1,6))
df2 = pd.DataFrame({'col1': ['pizza', 'pizza', 'chicken', 'cake', 'cake', 'chicken', 'ice cream'], 'col2': ['boy', 'girl', 'girl', 'boy', 'girl', 'boy', 'boy']}, index=range(10,17))
df1[df1.set_index(['col1','col2']).index.isin(df2.set_index(['col1','col2']).index)]

returns:

        col1  col2
1      pizza   boy
4      pizza  girl
5  ice cream   boy

Answered by u9628793

If for some reason you must stick to isin or its negated version ~isin, you can first create a new column that concatenates col1 and col2, then use isin to filter your data. Here is the code:

import pandas as pd
df1 = pd.DataFrame({'col1': ['pizza', 'hamburger', 'hamburger', 'pizza', 'ice cream'], 'col2': ['boy', 'boy', 'girl', 'girl', 'boy']}, index=range(1,6))
df2 = pd.DataFrame({'col1': ['pizza', 'pizza', 'chicken', 'cake', 'cake', 'chicken', 'ice cream'], 'col2': ['boy', 'girl', 'girl', 'boy', 'girl', 'boy', 'boy']}, index=range(10,17))

df1['indicator'] = df1['col1'].str.cat(df1['col2'])
df2['indicator'] = df2['col1'].str.cat(df2['col2'])

df2.loc[df2['indicator'].isin(df1['indicator'])].drop(columns=['indicator'])

which gives

         col1  col2
10      pizza   boy
11      pizza  girl
16  ice cream   boy
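
One thing to watch with the concatenation approach: joining the columns without a separator can make different pairs collide, for example ('ab', 'c') and ('a', 'bc') both become 'abc'. The sketch below uses made-up frames (left and right are hypothetical names) and assumes the chosen separator never occurs in the data:

```python
import pandas as pd

# Hypothetical data chosen so that two different pairs concatenate to the same string
left = pd.DataFrame({'col1': ['ab', 'a'], 'col2': ['c', 'bc']})
right = pd.DataFrame({'col1': ['a'], 'col2': ['bc']})

# Without a separator both left rows become 'abc', so both appear to match
naive_mask = left['col1'].str.cat(left['col2']).isin(
    right['col1'].str.cat(right['col2']))

# With a separator that is assumed not to occur in the data, the pairs stay distinct
SEP = '\x1f'  # ASCII unit separator; an arbitrary choice for this sketch
safe_mask = left['col1'].str.cat(left['col2'], sep=SEP).isin(
    right['col1'].str.cat(right['col2'], sep=SEP))

print(naive_mask.tolist())  # [True, True]
print(safe_mask.tolist())   # [False, True]
```

Negating either mask with ~ gives the rows that do not match, which is the ~isin variant mentioned in this answer.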