Check if column value is in other columns in pandas
Original source: http://stackoverflow.com/questions/43093394/
Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow
Asked by Amyunimus
I have the following dataframe in pandas:
   target       A     B       C
0     cat  bridge   cat   brush
1   brush     dog   cat    shoe
2  bridge     cat  shoe  bridge
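To experiment with the answers below, the dataframe above can be rebuilt as follows. This is a minimal sketch that assumes every column holds plain strings, which the question does not state explicitly.

```python
import pandas as pd

# Rebuild the example dataframe from the question (string columns assumed).
df = pd.DataFrame({
    "target": ["cat", "brush", "bridge"],
    "A": ["bridge", "dog", "cat"],
    "B": ["cat", "cat", "shoe"],
    "C": ["brush", "shoe", "bridge"],
})
print(df)
```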
How do I test whether df.target is in any of the columns ['A','B','C', etc.], where there are many columns to check?
I have tried merging A, B and C into a string to use df.abcstring.str.contains(df.target), but this does not work.
Answered by pansen
You can use drop, isin and any.
- drop the target column to have a df with your A, B, C columns only
- check if the values are in (isin) the target column
- and check if any hits are present
That's it.
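# Keyword arguments are used because recent pandas versions no longer accept a positional axis here.
df["exists"] = df.drop(columns="target").isin(df["target"]).any(axis=1)
print(df)
   target       A     B       C  exists
0     cat  bridge   cat   brush    True
1   brush     dog   cat    shoe   False
2  bridge     cat  shoe  bridge    True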
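One detail worth spelling out: when isin receives a Series, pandas aligns it on the index, so each row is compared only with its own target value. Passing a plain list instead asks whether a cell matches any target value in the whole column. The sketch below contrasts the two; the names others, row_wise and anywhere are illustrative only and do not come from the answer.

```python
import pandas as pd

df = pd.DataFrame({
    "target": ["cat", "brush", "bridge"],
    "A": ["bridge", "dog", "cat"],
    "B": ["cat", "cat", "shoe"],
    "C": ["brush", "shoe", "bridge"],
})

others = df.drop(columns="target")

# Series argument: aligned on the index, so the comparison is row by row.
row_wise = others.isin(df["target"]).any(axis=1)

# List argument: a cell matches if it equals ANY value found in target.
anywhere = others.isin(df["target"].tolist()).any(axis=1)

print(row_wise.tolist())  # [True, False, True]
print(anywhere.tolist())  # [True, True, True]
```

Row 1 differs because "cat" appears in column B and is a target value elsewhere, although it is not the target of row 1.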
Answered by MaxU
OneHotEncoder approach (note that this checks each row against every value in target, not only the target of the same row):
In [165]: x = pd.get_dummies(df.drop(columns='target'), prefix='', prefix_sep='')
In [166]: x
Out[166]:
   bridge  cat  dog  cat  shoe  bridge  brush  shoe
0       1    0    0    1     0       0      1     0
1       0    0    1    1     0       0      0     1
2       0    1    0    0     1       1      0     0
In [167]: x[df['target']].eq(1).any(axis=1)
Out[167]:
0    True
1    True
2    True
dtype: bool
Explanation:
In [168]: x[df['target']]
Out[168]:
   cat  cat  brush  bridge  bridge
0    0    1      1       1       0
1    0    1      0       0       0
2    1    0      0       0       1
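The output above is True for every row, including row 1, because the selected dummy columns cover every target value rather than only the target of the current row. The sketch below reproduces that result and compares it with a same-row comparison; the names any_target and same_row are illustrative, and the same-row line uses eq as an assumption about what the question intends, not as part of this answer.

```python
import pandas as pd

df = pd.DataFrame({
    "target": ["cat", "brush", "bridge"],
    "A": ["bridge", "dog", "cat"],
    "B": ["cat", "cat", "shoe"],
    "C": ["brush", "shoe", "bridge"],
})

# One indicator column per (column, value) pair, labelled by the value only.
x = pd.get_dummies(df.drop(columns="target"), prefix="", prefix_sep="")

# Selects every indicator column whose label is some target value.
any_target = x[df["target"]].eq(1).any(axis=1)

# Same-row comparison for reference (assumed to be what the question asks).
same_row = df.drop(columns="target").eq(df["target"], axis=0).any(axis=1)

print(any_target.tolist())  # [True, True, True]
print(same_row.tolist())    # [True, False, True]
```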
Answered by jezrael
You can use eq, with pop to drop the target column (pop removes it from df in place), if you need to check row by row:
mask = df.eq(df.pop('target'), axis=0)
print (mask)
       A      B      C
0  False   True  False
1  False  False  False
2  False  False   True
And then, if you need to check for at least one True, add any:
mask = df.eq(df.pop('target'), axis=0).any(axis=1)
print (mask)
0     True
1    False
2     True
dtype: bool
df['new'] = df.eq(df.pop('target'), axis=0).any(axis=1)
print (df)
        A     B       C    new
0  bridge   cat   brush   True
1     dog   cat    shoe  False
2     cat  shoe  bridge   True
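Because pop removes target from df, each of the snippets in this answer has to start from a fresh copy of the dataframe. If the target column should be kept, a non-mutating variant is possible. This is a sketch under that assumption, with others as an illustrative name; it swaps pop for drop and is not the answer's own code.

```python
import pandas as pd

df = pd.DataFrame({
    "target": ["cat", "brush", "bridge"],
    "A": ["bridge", "dog", "cat"],
    "B": ["cat", "cat", "shoe"],
    "C": ["brush", "shoe", "bridge"],
})

# drop returns a new frame, so df keeps its target column.
others = df.drop(columns="target")

# Compare each column with target row by row, then reduce across columns.
df["new"] = others.eq(df["target"], axis=0).any(axis=1)
print(df)
```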
But if you need to check against all values in the target column, use isin:
mask = df.isin(df.pop('target').values.tolist())
print (mask)
       A      B      C
0   True   True   True
1  False   True  False
2   True  False   True
And if you want to check whether all values in a row are True, add all:
df['new'] = df.isin(df.pop('target').values.tolist()).all(axis=1)
print (df)
        A     B       C    new
0  bridge   cat   brush   True
1     dog   cat    shoe  False
2     cat  shoe  bridge  False
Answered by AndreyF
You can apply a function to each row that counts the number of values that match the value in the 'target' column:
df["exist"] = df.apply(lambda row: row.value_counts()[row['target']] > 1, axis=1)
For a dataframe that looks like:
   b  c target
0  3  a      a
1  3  4      2
2  3  4      2
3  3  4      2
4  3  4      4
the output will be:
   b  c target  exist
0  3  a      a   True
1  3  4      2  False
2  3  4      2  False
3  3  4      2  False
4  3  4      4   True
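The count is compared with 1 rather than 0 because the row passed to the function still contains the target cell itself, so every target value is counted at least once. A minimal sketch applying the same idea to the dataframe from the question (string columns assumed):

```python
import pandas as pd

df = pd.DataFrame({
    "target": ["cat", "brush", "bridge"],
    "A": ["bridge", "dog", "cat"],
    "B": ["cat", "cat", "shoe"],
    "C": ["brush", "shoe", "bridge"],
})

# value_counts() includes the target cell, so a count above 1 means
# the value also appears in at least one other column of that row.
df["exist"] = df.apply(
    lambda row: row.value_counts()[row["target"]] > 1, axis=1
)
print(df["exist"].tolist())  # [True, False, True]
```

Since apply with axis=1 calls a Python function once per row, this is likely to be slower on large frames than the column-wise eq or isin approaches.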
Answered by FLab
Another approach, using the index difference method:
matches = df[df.columns.difference(['target'])].eq(df['target'], axis = 0)
#        A      B      C
# 0  False   True  False
# 1  False  False  False
# 2  False  False   True
# Check if at least one match:
matches.any(axis = 1)
# Out[30]:
# 0     True
# 1    False
# 2     True
In case you want to see which columns match the target, here is a possible solution:
import numpy as np
# np.where returns a tuple of index arrays; take its first element.
matches.apply(lambda x: ", ".join(x.index[np.where(x.tolist())[0]]), axis = 1)
Out[53]:
0    B
1
2    C
dtype: object
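The same listing can be produced without NumPy by indexing the column labels with each boolean row directly. This is a sketch of that variation, with matched_cols as an illustrative name; it assumes the dataframe from the question with string columns.

```python
import pandas as pd

df = pd.DataFrame({
    "target": ["cat", "brush", "bridge"],
    "A": ["bridge", "dog", "cat"],
    "B": ["cat", "cat", "shoe"],
    "C": ["brush", "shoe", "bridge"],
})

# Boolean frame: True where a column equals the target of the same row.
matches = df[df.columns.difference(["target"])].eq(df["target"], axis=0)

# For each row, keep the labels whose cell is True and join them.
matched_cols = matches.apply(
    lambda row: ", ".join(row.index[row.to_numpy()]), axis=1
)
print(matched_cols.tolist())  # ['B', '', 'C']
```

Rows without a match yield an empty string, which mirrors the blank entry for row 1 in the output above.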