pandas: filtering out rows/columns with zero values in a MultiIndex DataFrame

Question

Asked by Dov

I have the following pandas MultiIndex DataFrame in Python:

             0         1         2         3 
bar one  0.000000 -0.929631  0.688818 -1.264180
    two  1.130977  0.063277  0.161366  0.598538
baz one  1.420532  0.052530 -0.701400  0.678847
    two -1.197097  0.314381  0.269551  1.115699
foo one -0.077463  0.437145 -0.202377  0.260864
    two -0.815926 -0.508988 -1.238619  0.899013
qux one -0.347863 -0.999990 -1.428958 -1.488556
    two  1.218567 -0.593987  0.099003  0.800736

My question is how to filter out:

Columns that contain zero values (column 0 in the above example).
With regard to row filtering: how can I filter out only the row that has a zero, (bar, one), and how can I filter out both (bar, one) and (bar, two)?
(Apologies for my non-native English ;)

Answer 1

Answered by Julien Spronck

To filter out columns that contain zero values, you can use

df2 = df.loc[:, (df != 0).all(axis=0)]

To filter out rows that contain zero values, you can use

df2 = df.loc[(df != 0).all(axis=1), :]
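As a rough illustration of the two masks above, here is a small sketch on a hand-made frame. The values and the names `nonzero`, `cols_kept`, and `rows_kept` are made up for this example and are not from the original question.

```python
import pandas as pd

# Hypothetical 4x3 frame with a two-level row index; the only zero sits
# at row ('bar', 'one'), column 0.
index = pd.MultiIndex.from_product([["bar", "baz"], ["one", "two"]])
df = pd.DataFrame(
    [[0.0, 1.0, 2.0],
     [3.0, 4.0, 5.0],
     [6.0, 7.0, 8.0],
     [9.0, 1.5, 2.5]],
    index=index,
)

nonzero = df != 0  # element-wise boolean frame, True where the value is not 0

# Keep a column only if every entry in it is non-zero (column 0 is dropped).
cols_kept = df.loc[:, nonzero.all(axis=0)]

# Keep a row only if every entry in it is non-zero (('bar', 'one') is dropped).
rows_kept = df.loc[nonzero.all(axis=1), :]

print(cols_kept)
print(rows_kept)
```

Note that the row mask removes only (bar, one); (bar, two) stays because it has no zeros of its own.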

To drop specific rows by their index labels, you can use

df2 = df.drop('bar') ## drops both 'bar one' and 'bar two'
df2 = df.drop(('baz', 'two')) ## drops only 'baz two'

For example,

import numpy as np
import pandas as pd

arrays = [np.array(['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux']), np.array(['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two'])]
df = pd.DataFrame(np.random.randn(8, 4), index=arrays)
df.loc[('bar', 'one'), 2] = 0 ## plant a zero in column 2 (.ix was removed in pandas 1.0)
df = df.loc[:, (df != 0).all(axis=0)] ## drops column 2
df = df.drop('bar') ## drops both 'bar one' and 'bar two'
df = df.drop(('baz', 'two')) ## drops only 'baz two'

#                 0         1         3
# baz one  0.686969  0.410614  0.841630
# foo one  1.522938  0.555734 -1.585507
#     two -0.975976  0.522571 -0.041386
# qux one -0.991787  0.154645  0.179536
#     two -0.725685  0.809784  0.394708

Another way if you have no NaN values in your dataframe is to transform your 0s into NaN and drop the columns or the rows that have NaN:

df[df != 0.].dropna(axis=1) # to remove the columns with 0
df[df != 0.].dropna(axis=0) # to remove the rows with 0
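The "no NaN values" condition matters because the two approaches treat existing NaN differently. The sketch below uses made-up data (the names `via_mask` and `via_nan` are hypothetical) to show the difference: `NaN != 0` evaluates to True, so the boolean mask keeps a row that holds NaN, while the `dropna` route removes it along with the zero row.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: one genuine zero and one pre-existing NaN.
index = pd.MultiIndex.from_product([["bar", "baz"], ["one", "two"]])
df = pd.DataFrame(
    [[0.0, 1.0],
     [2.0, np.nan],
     [3.0, 4.0],
     [5.0, 6.0]],
    index=index,
)

# Mask approach: only the row with the zero is removed.
via_mask = df.loc[(df != 0).all(axis=1), :]

# NaN approach: zeros become NaN, then every row holding any NaN is removed,
# including the row whose NaN was there from the start.
via_nan = df[df != 0.0].dropna(axis=0)

print(via_mask)
print(via_nan)
```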

Finally, if you want to drop every row under a first-level label (for example, all of the 'bar' rows) as soon as one of those rows contains a zero, you can do this:

indices = df.loc[(df == 0).any(axis=1), :].index.tolist() ## multi-index values that contain 0
for label in {ind[0] for ind in indices}: ## unique first-level labels, so no label is dropped twice
    df = df.drop(label)
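The same group-level drop can also be written without a loop. This is a sketch under assumed data, not the answer author's method: it collects the first-level labels of the rows that contain a zero and filters with `isin`. The names `has_zero`, `bad_groups`, and `filtered` are invented for the example.

```python
import pandas as pd

# Hypothetical frame: both 'bar' rows and one 'foo' row contain a zero.
index = pd.MultiIndex.from_product([["bar", "baz", "foo"], ["one", "two"]])
df = pd.DataFrame(
    [[0.0, 1.0],
     [0.0, 2.0],
     [3.0, 4.0],
     [5.0, 6.0],
     [7.0, 0.0],
     [8.0, 9.0]],
    index=index,
)

# True for each row that holds at least one zero.
has_zero = (df == 0).any(axis=1)

# First-level labels of those rows, de-duplicated.
bad_groups = df.index[has_zero.to_numpy()].get_level_values(0).unique()

# Keep only the rows whose first-level label is not in bad_groups.
filtered = df[~df.index.get_level_values(0).isin(bad_groups)]

print(filtered)
```

Because the labels are de-duplicated before filtering, a group is handled once even when several of its rows contain zeros.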
