Python Pandas: drop columns based on the column's maximum value

Question

Asked by professorDante

I'm just getting going with Pandas as a tool for munging two-dimensional arrays of data. It's super overwhelming, even after reading the docs. You can do so much that I can't figure out how to do anything, if that makes any sense.

My dataframe (simplified):

Date       Stock1  Stock2   Stock3
2014.10.10  74.75  NaN     NaN
2014.9.9    NaN    100.95  NaN 
2010.8.8    NaN    NaN     120.45

So each column only has one value.

I want to remove all columns that have a max value less than x. So say here as an example, if x = 80, then I want a new DataFrame:

Date        Stock2   Stock3
2014.10.10   NaN     NaN
2014.9.9     100.95  NaN 
2010.8.8     NaN     120.45

How can this be achieved? I've looked at dataframe.max(), which gives me a Series. Can I use that, or somehow pass a lambda function to select()?

Answer 1

Answered by Adam Hughes

Use the result of df.max() to index with.

In [18]: import numpy as np

In [19]: from pandas import DataFrame

In [23]: df = DataFrame(np.random.randn(3,3), columns=['a','b','c'])

In [36]: df
Out[36]: 
          a         b         c
0 -0.928912  0.220573  1.948065
1 -0.310504  0.847638 -0.541496
2 -0.743000 -1.099226 -1.183567


In [24]: df.max()
Out[24]: 
a   -0.310504
b    0.847638
c    1.948065
dtype: float64

Next, we make a boolean expression out of this:

In [31]: df.max() > 0
Out[31]: 
a    False
b     True
c     True
dtype: bool

Next, you can index df.columns with this boolean Series (this is called boolean indexing):

In [34]: df.columns[df.max() > 0]
Out[34]: Index([u'b', u'c'], dtype='object')

Finally, you can pass the resulting column labels to the DataFrame:

In [35]: df[df.columns[df.max() > 0]]
Out[35]: 
          b         c
0  0.220573  1.948065
1  0.847638 -0.541496
2 -1.099226 -1.183567
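
Because the frame above is built from random numbers, the values differ on every run. As a reproducible sketch, the same steps can be replayed with the printed values hard-coded. The names mask, selected and same are illustrative and not part of the original answer, and the .loc form is an assumed equivalent spelling rather than something the answer itself uses.

```python
import pandas as pd

# Values copied from the session output above so the result is reproducible.
df = pd.DataFrame(
    {
        "a": [-0.928912, -0.310504, -0.743000],
        "b": [0.220573, 0.847638, -1.099226],
        "c": [1.948065, -0.541496, -1.183567],
    }
)

# Boolean Series indexed by column name: True where the column max exceeds 0.
mask = df.max() > 0

# Index the column labels with the mask, then select those columns.
selected = df[df.columns[mask]]

# Passing the mask to .loc on the column axis should select the same columns.
same = df.loc[:, mask]

print(selected)
```

Here column a is dropped because its largest value is negative, while b and c are kept.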

Of course, instead of 0, you can use any value you want as the cutoff for dropping.
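
Applied to the data in the question, the same idea might look like the sketch below. It assumes that Date is the index rather than a regular column (the question does not say), and it uses >= so that only columns whose maximum is strictly less than x are removed. The names x, keep and result are illustrative.

```python
import numpy as np
import pandas as pd

# The simplified frame from the question, with Date assumed to be the index.
df = pd.DataFrame(
    {
        "Stock1": [74.75, np.nan, np.nan],
        "Stock2": [np.nan, 100.95, np.nan],
        "Stock3": [np.nan, np.nan, 120.45],
    },
    index=pd.Index(["2014.10.10", "2014.9.9", "2010.8.8"], name="Date"),
)

x = 80  # cutoff from the question

# df.max() skips NaN by default, so each column's single value is its max.
keep = df.max() >= x

# Keep only the columns whose maximum is at least x.
result = df.loc[:, keep]

print(result)
```

Stock1 is removed because its maximum, 74.75, is below 80. A column that is entirely NaN has a NaN maximum, and since NaN >= x evaluates to False, such a column would be dropped as well.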
