pandas - Mask a dataframe by column name

Question

Asked by Fabio Lamanna

Starting from this simple dataframe df:

col1,col2
1,3
2,1
3,8

I would like to apply a boolean mask as a function of the column name. I know that this is easy for values:

mask = df <= 1

df = df[mask]

which returns:

mask:

    col1   col2
0   True  False
1  False   True
2  False  False

df:

   col1  col2
0     1   NaN
1   NaN     1
2   NaN   NaN

as expected. Now I would like to obtain a boolean mask based on the column name, something like:

mask = df == df['col1']

which should return:

mask

    col1   col2
0   True  False
1   True  False
2   True  False

EDIT:

This may seem odd, but I need this kind of mask to later filter seaborn heatmaps by column.

Answer 1

Answered by KT.

As noted in the comments, situations where you would need a "mask" like that seem rare (and chances are, you are not in one of them). Consequently, there is probably no nice built-in solution for them in pandas.

Nonetheless, you can achieve what you need with a hack like the following:

mask = (df == df) & (df.columns == 'col1')
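As a minimal runnable sketch of this hack, assuming the question's sample frame and its actual column name `col1`, the mask can be built and inspected like this. The frame is constructed inline here purely for illustration.

```python
import pandas as pd

# Sample frame from the question, built inline for illustration.
df = pd.DataFrame({"col1": [1, 2, 3], "col2": [3, 1, 8]})

# (df == df) is True in every non-null cell. The 1-D boolean array
# df.columns == "col1" has one entry per column and is broadcast
# across all rows when combined with the frame.
mask = (df == df) & (df.columns == "col1")
print(mask)
```

The column comparison does the real work; `df == df` only supplies a frame of the right shape and labels to broadcast against.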

Update: As noted in the comments, if your data frame contains nulls, the mask computed this way will always be False at the corresponding locations. If this is a problem, the safer option is:

mask = ((df == df) | df.isnull()) & (df.columns == 'col1')
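To illustrate the difference, here is a small sketch on a frame with missing values (the data is made up for this example). It also shows one possible alternative that is not part of the answer: building the mask from the column labels and the frame's shape only, so that cell values never enter the computation.

```python
import numpy as np
import pandas as pd

# Hypothetical data with nulls, chosen only to show the difference.
df = pd.DataFrame({"col1": [1.0, np.nan, 3.0], "col2": [3.0, 1.0, np.nan]})

is_col1 = df.columns == "col1"  # one boolean per column

# NaN != NaN, so the null cell in col1 comes out False here.
naive = (df == df) & is_col1

# Treat null cells as matching too, then restrict to the wanted column.
safe = ((df == df) | df.isnull()) & is_col1

# Alternative sketch (an assumption, not from the answer): repeat the
# per-column booleans once per row, ignoring the cell values entirely.
by_name = pd.DataFrame(
    np.tile(is_col1, (len(df), 1)), index=df.index, columns=df.columns
)

print(naive)
print(safe)
print(by_name)
```

The last variant may be preferable when the mask should depend on nothing but the column names.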

Answer 2

Answered by Anton Protopopov

You could transpose your dataframe, then compare it with the column, and then transpose back. A bit weird, but a working example:

import pandas as pd
from io import StringIO

data = """
col1,col2
1,3
2,1
3,8
"""

df = pd.read_csv(StringIO(data))
mask = (df.T == df['col1']).T

In [176]: df
Out[176]:
   col1  col2
0     1     3
1     2     1
2     3     8


In [178]: mask
Out[178]:
   col1   col2
0  True  False
1  True  False
2  True  False

EDIT

I found another way to do this: you could use the isin method:

In [41]: df.isin(df.col1)
Out[41]:
   col1   col2
0  True  False
1  True  False
2  True  False

EDIT2

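As @DSM showed in the comments, these two approaches do not work correctly in all cases: they compare values rather than column names, so they break when another column holds the same values as col1. So you should use @KT.'s method. Still, let's play a bit more with transpose: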

df.col2 = df.col1

In [149]: df
Out[149]:
   col1  col2
0     1     1
1     2     2
2     3     3

In [147]: df.isin(df.T[df.columns == 'col1'].T)
Out[147]:
   col1   col2
0  True  False
1  True  False
2  True  False
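The following sketch puts the three approaches from this answer side by side on the frame with identical columns. It is only an illustration of the behaviour described above; the variable names are made up for the example.

```python
import pandas as pd

# Both columns hold the same values, as after `df.col2 = df.col1`.
df = pd.DataFrame({"col1": [1, 2, 3], "col2": [1, 2, 3]})

# Transpose-and-compare matches on values, so col2 is also marked True.
by_transpose = (df.T == df["col1"]).T

# isin with a Series compares each column to the Series by index label,
# so it is again a value comparison and col2 is also marked True.
by_isin_series = df.isin(df["col1"])

# isin with a one-column DataFrame matches on both index and column
# labels, so only col1 can be True.
by_isin_frame = df.isin(df.T[df.columns == "col1"].T)

print(by_transpose)
print(by_isin_series)
print(by_isin_frame)
```

Only the last variant depends on the column label, which is why it still returns the expected mask here.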
