pandas: finding the common elements between columns of multiple data frames

Question

Asked by Tikku

I hope you can help me. I am new to Python and pandas, so please bear with me. I am trying to find the words common to three data frames, and I am using Jupyter Notebook.

For example:

df1=
A
dog
cat
cow 
duck
snake

df2=
A
pig
snail
bird
dog

df3=
A
eagle
dog 
snail
monkey

Each data frame has a single column, A. I would like to find:

the word common to all of the columns
the words that are unique to their own column and not shared with the others

Example:


duck is unique to df1, pig is unique to df2, and monkey is unique to df3.


I am using the code below, which gets me part of the way, but it does not give me what I want directly:

df1[df1['A'].isin(df2['A']) & (df2['A']) & (df3['A'])]

Kindly let me know where I am going wrong. Cheers


Answer 1

Accepted answer by cs95

The problem with your current approach is that you need to chain multiple isin calls. What's worse is that you'd need to keep track of which dataframe is the largest and call isin on that one. Otherwise, it doesn't work.
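As a rough sketch of what chaining isin calls could look like, the snippet below rebuilds the three example frames from the question (the literal values are copied from the post, with stray whitespace dropped) and builds one boolean mask per comparison. The expression in the question combines the first mask with the raw string columns df2['A'] and df3['A'] rather than with further isin masks, which is one reason it does not behave as intended.

```python
import pandas as pd

# Example frames reconstructed from the question.
df1 = pd.DataFrame({"A": ["dog", "cat", "cow", "duck", "snake"]})
df2 = pd.DataFrame({"A": ["pig", "snail", "bird", "dog"]})
df3 = pd.DataFrame({"A": ["eagle", "dog", "snail", "monkey"]})

# Every isin call is made on df1's column, so both masks share df1's index
# and can be combined element-wise with &.
mask = df1["A"].isin(df2["A"]) & df1["A"].isin(df3["A"])

# Keep only the rows of df1 whose value also appears in df2 and df3.
common = df1.loc[mask, "A"].tolist()
print(common)  # ['dog']
```

This works, but it needs one more isin term for every additional frame, which is what the alternatives in this answer avoid.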

To make things easier, you can use np.intersect1d:

>>> import numpy as np
>>> np.intersect1d(df3.A, np.intersect1d(df1.A, df2.A))
array(['dog'], dtype=object)

A similar method using functools.reduce + intersect1d, suggested by piRSquared:

>>> from functools import reduce  # needed on Python 3; reduce is a builtin on Python 2
>>> reduce(np.intersect1d, [df1.A, df2.A, df3.A])
array(['dog'], dtype=object)
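For any number of frames, the reduce pattern above could be wrapped in a small helper. This is only a sketch: common_values and its column parameter are hypothetical names introduced here for illustration, and the frames are rebuilt from the question's example.

```python
from functools import reduce

import numpy as np
import pandas as pd


def common_values(frames, column="A"):
    """Return the values of `column` that appear in every frame.

    Hypothetical helper, not part of pandas or NumPy. It expects at least
    two frames; with a single frame, reduce returns that column unchanged.
    """
    arrays = [frame[column].to_numpy() for frame in frames]
    # np.intersect1d returns the sorted, unique values shared by two arrays;
    # reduce folds it pairwise across the whole list.
    return reduce(np.intersect1d, arrays)


# Example frames reconstructed from the question.
df1 = pd.DataFrame({"A": ["dog", "cat", "cow", "duck", "snake"]})
df2 = pd.DataFrame({"A": ["pig", "snail", "bird", "dog"]})
df3 = pd.DataFrame({"A": ["eagle", "dog", "snail", "monkey"]})

result = common_values([df1, df2, df3])
print(result.tolist())  # ['dog']
```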

Answer 2

Answer by piRSquared

The simplest way is to use set intersection:

list(set(df1.A) & set(df2.A) & set(df3.A))

['dog']

However, if you have a long list of these, I'd use reduce from functools. The same technique can be used with @cs95's use of np.intersect1d as well.

from functools import reduce

list(reduce(set.intersection, map(set, [df1.A, df2.A, df3.A])))

['dog']
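The question also asks for the words that are unique to each column, which the set approach can cover with set difference. The following is a minimal sketch, assuming the example frames from the question; the frames, sets, and unique dictionaries are names made up for this illustration.

```python
import pandas as pd

# Example frames reconstructed from the question.
frames = {
    "df1": pd.DataFrame({"A": ["dog", "cat", "cow", "duck", "snake"]}),
    "df2": pd.DataFrame({"A": ["pig", "snail", "bird", "dog"]}),
    "df3": pd.DataFrame({"A": ["eagle", "dog", "snail", "monkey"]}),
}

# One set of words per frame.
sets = {name: set(frame["A"]) for name, frame in frames.items()}

unique = {}
for name, values in sets.items():
    # Union of the words found in every *other* frame.
    others = set().union(*(s for other, s in sets.items() if other != name))
    # Words that appear in this frame only, sorted for stable output.
    unique[name] = sorted(values - others)

print(unique)
```

With the example data, snail is shared by df2 and df3, so it is not reported as unique to either of them.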
