pandas - mask dataframe by column name
Original question: http://stackoverflow.com/questions/34064948/
Note: this content comes from Stack Overflow and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on Stack Overflow.
Asked by Fabio Lamanna
Starting from this simple dataframe df:
col1,col2
1,3
2,1
3,8
I would like to apply a boolean mask as a function of the column name. I know that this is easy to do for values:
mask = df <= 1
df = df[mask]
which returns:
mask:
col1 col2
0 True False
1 False True
2 False False
df:
col1 col2
0 1 NaN
1 NaN 1
2 NaN NaN
as expected. Now I would like to obtain a boolean mask based on the column name, something like:
mask = df == df['col1']
which should return:
mask:
col1 col2
0 True False
1 True False
2 True False
EDIT:
This may seem weird, but I need these kinds of masks to later filter seaborn heatmaps by column.
Answered by KT.
As noted in the comments, situations where you would need a "mask" like that seem rare (and chances are, you are not in one of them). Consequently, there is probably no nice "built-in" solution for them in Pandas.
Nonetheless, you can achieve what you need with a hack like the following:
mask = (df == df) & (df.columns == 'col1')
Update: As noted in the comments, if your data frame contains nulls, the mask computed this way will always be False at the corresponding locations, because NaN never compares equal to itself. If this is a problem, the safer option is:
mask = ((df == df) | df.isnull()) & (df.columns == 'col1')
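To make the difference between the two variants concrete, here is a minimal sketch on a small frame with one missing value. The sample data, the variable names (is_target, naive, safe, explicit) and the np.tile alternative are illustrative assumptions, not part of the original answer; the column is taken to be named col1, as in the question's data.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: same shape as the question's frame, plus one NaN.
df = pd.DataFrame({"col1": [1.0, np.nan, 3.0], "col2": [3.0, 1.0, 8.0]})

# One boolean per column; pandas broadcasts it across all rows.
is_target = df.columns == "col1"

# Variant 1: df == df is False wherever the value is NaN.
naive = (df == df) & is_target

# Variant 2: OR-ing with isnull() makes the left operand True everywhere.
safe = ((df == df) | df.isnull()) & is_target

# Assumed alternative: build the mask directly from the column flags.
explicit = pd.DataFrame(
    np.tile(is_target, (len(df), 1)), index=df.index, columns=df.columns
)

print(naive)
print(safe)
```

In this sketch naive is False in the col1 row that holds NaN, while safe and explicit are True for every row of col1 and False elsewhere.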
Answered by Anton Protopopov
You could transpose your dataframe, then compare it with the column, and then transpose back. A bit weird, but a working example:
import pandas as pd
from io import StringIO
data = """
col1,col2
1,3
2,1
3,8
"""
df = pd.read_csv(StringIO(data))
mask = (df.T == df['col1']).T
In [176]: df
Out[176]:
col1 col2
0 1 3
1 2 1
2 3 8
In [178]: mask
Out[178]:
col1 col2
0 True False
1 True False
2 True False
EDIT
I found another way to do this: you could use the isin method:
In [41]: df.isin(df.col1)
Out[41]:
col1 col2
0 True False
1 True False
2 True False
EDIT2
As @DSM showed in the comments, these two approaches do not always work correctly, so you should use @KT.'s method. But let's play a bit more with transpose:
df.col2 = df.col1
In [149]: df
Out[149]:
col1 col2
0 1 1
1 2 2
2 3 3
In [147]: df.isin(df.T[df.columns == 'col1'].T)
Out[147]:
col1 col2
0 True False
1 True False
2 True False
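The following sketch puts the three approaches side by side on the frame above, where col2 holds the same values as col1. It is an illustration under that assumption; the variable names are made up here, and df[["col1"]] is used as a shorter way to get the same single-column frame as the transpose expression.

```python
import pandas as pd

# Same situation as above: col2 is a copy of col1.
df = pd.DataFrame({"col1": [1, 2, 3], "col2": [1, 2, 3]})

# Transpose-and-compare: compares values only, so col2 matches too.
transposed = (df.T == df["col1"]).T

# isin with a Series: matched by index label, again values only.
isin_series = df.isin(df["col1"])

# isin with a one-column DataFrame: index AND column labels must match,
# so every cell of col2 comes out False.
isin_frame = df.isin(df[["col1"]])

print(transposed)
print(isin_series)
print(isin_frame)
```

Only the last variant depends on the column label rather than on the values, which is why it still marks col1 alone when the two columns are identical.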