python pandas: drop a df column if condition

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/30351125/

Time: 2020-09-13 23:22:47  Source: igfitidea

Tags: python, pandas, dataframe

Asked by Boosted_d16

I would like to drop a given column from a pandas dataframe if all the values in the column are "0%".

My df:

import pandas as pd

data = {'UK': ['11%', '16%', '7%', '52%', '2%', '5%', '3%', '3%'],
        'US': ['0%', '0%', '0%', '0%', '0%', '0%', '0%', '0%'],
        'DE': ['11%', '16%', '7%', '52%', '2%', '5%', '3%', '3%'],
        'FR': ['11%', '16%', '7%', '52%', '2%', '5%', '3%', '3%']
        }
dummy_df = pd.DataFrame(data,
                        index=['cat1', 'cat2', 'cat3', 'cat4', 'cat5', 'cat6', 'cat7', 'cat8'],
                        columns=['UK', 'US', 'DE', 'FR'])

My code so far:

dummy_df.drop(dummy_df == '0%',inplace=True)

I get a ValueError:

ValueError: labels ['UK' 'US' 'DE' 'FR'] not contained in axis

Answered by joris

In [186]: dummy_df.loc[:, ~(dummy_df == '0%').all()]
Out[186]:
       UK   DE   FR
cat1  11%  11%  11%
cat2  16%  16%  16%
cat3   7%   7%   7%
cat4  52%  52%  52%
cat5   2%   2%   2%
cat6   5%   5%   5%
cat7   3%   3%   3%
cat8   3%   3%   3%

Explanation:

You already have the comparison with '0%'; it gives the following dataframe:

In [182]: dummy_df == '0%'
Out[182]:
         UK    US     DE     FR
cat1  False  True  False  False
cat2  False  True  False  False
cat3  False  True  False  False
cat4  False  True  False  False
cat5  False  True  False  False
cat6  False  True  False  False
cat7  False  True  False  False
cat8  False  True  False  False

Now we want to know which columns contain only True values:

In [183]: (dummy_df == '0%').all()
Out[183]:
UK    False
US     True
DE    False
FR    False
dtype: bool

And finally, we can index with these boolean values (taking the opposite with ~, as we don't want to select the columns where this is True): dummy_df.loc[:, ~(dummy_df == '0%').all()].

Similarly, you can also do: dummy_df.loc[:, (dummy_df != '0%').any()] (this selects the columns where at least one value is not equal to '0%').
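For reference, here is a minimal self-contained sketch of the approach, run on a smaller frame than the one in the question. The names df, all_zero, kept and dropped are illustrative only, and the second option assumes a pandas version where drop accepts the columns keyword (0.21 or later). It also shows that drop works once it is given column labels rather than a boolean frame.

```python
import pandas as pd

# Smaller stand-in for dummy_df from the question (illustrative data).
data = {
    'UK': ['11%', '16%', '7%'],
    'US': ['0%', '0%', '0%'],
    'DE': ['11%', '16%', '7%'],
}
df = pd.DataFrame(data, index=['cat1', 'cat2', 'cat3'])

# Boolean Series indexed by column name: True where every value is '0%'.
all_zero = (df == '0%').all()

# Option 1: keep the columns that are not all '0%' (returns a new frame).
kept = df.loc[:, ~all_zero]

# Option 2: pass the labels of the all-'0%' columns to drop
# (assumes pandas >= 0.21 for the `columns` keyword; also returns a new frame).
dropped = df.drop(columns=all_zero[all_zero].index)

print(kept)
```

Both options leave the original frame untouched; assign the result back to the original name if the change should replace it.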

Answered by Zero

First get the columns where at least one value != '0%':

In [163]: cols = (dummy_df != '0%').any()

In [164]: cols
Out[164]:
UK     True
US    False
DE     True
FR     True
dtype: bool

Then select only the columns for which cols is True:

In [165]: dummy_df[cols[cols].index]
Out[165]:
       UK   DE   FR
cat1  11%  11%  11%
cat2  16%  16%  16%
cat3   7%   7%   7%
cat4  52%  52%  52%
cat5   2%   2%   2%
cat6   5%   5%   5%
cat7   3%   3%   3%
cat8   3%   3%   3%
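As a side note, the boolean Series can probably be passed to .loc directly instead of first being converted to an index of labels. The sketch below uses a small illustrative frame (df, by_index and by_mask are hypothetical names, not from the answer) and checks that both selections give the same result.

```python
import pandas as pd

# Small illustrative frame; 'US' is the only all-'0%' column.
df = pd.DataFrame(
    {'UK': ['11%', '7%'], 'US': ['0%', '0%'], 'FR': ['3%', '3%']},
    index=['cat1', 'cat2'],
)

# True for columns that have at least one value different from '0%'.
cols = (df != '0%').any()

# Selection by the labels of the True entries, as in the answer.
by_index = df[cols[cols].index]

# Selection with the boolean mask applied along the column axis.
by_mask = df.loc[:, cols]

print(by_index.equals(by_mask))
```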