pandas: conditional count across rows

Question

Asked by DataSwede

I have a DataFrame with months as columns and departments as rows.

                2013April  2013May  2013June
        Dep1        0         10        15
        Dep2        10        15        20

I'm looking to add a column that counts, for each row, the number of months with a value greater than 0. For example:

                2013April  2013May  2013June  Count>0 
        Dep1        0         10        15       2
        Dep2        10        15        20       3

The number of columns this function needs to span is variable. I think defining a function then using .apply is the solution, but I can't seem to figure it out.

Answer 1

Answered by acushner

First, pick the columns you want to count over and store their names in cols:

df[cols].apply(lambda s: (s > 0).sum(), axis=1)

This takes advantage of the fact that True and False are 1 and 0, respectively, in Python, so summing a boolean Series counts its True values.
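To see why a sum works as a count, here is a minimal sketch on a single row. The values are copied from Dep1 in the question; the names row, mask, and count are illustrative only.

```python
import pandas as pd

# One row of the question's table (Dep1); `row` is an illustrative name.
row = pd.Series([0, 10, 15], index=["2013April", "2013May", "2013June"])

mask = row > 0            # boolean Series: False, True, True
count = int(mask.sum())   # each True adds 1, each False adds 0
print(count)              # 2
```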

Actually, there's a better way:

(df[cols] > 0).sum(1)

This is faster because it takes advantage of NumPy vectorization instead of calling a Python lambda once per row:

%timeit df.apply(lambda s: (s > 0).sum(), axis=1)
10 loops, best of 3: 141 ms per loop

%timeit (df > 0).sum(1)
1000 loops, best of 3: 319 μs per loop
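Applied to the table from the question, the vectorized version might look like the sketch below. How cols is chosen is an assumption made here for illustration (every column whose name starts with "2013"); any list of column names would work, which is what lets the count span a variable number of columns.

```python
import pandas as pd

# Rebuild the example table from the question.
df = pd.DataFrame(
    {"2013April": [0, 10], "2013May": [10, 15], "2013June": [15, 20]},
    index=["Dep1", "Dep2"],
)

# Assumption: the month columns are those whose names start with "2013".
cols = [c for c in df.columns if c.startswith("2013")]

# Compare every selected cell to 0, then sum the booleans along each row.
df["Count>0"] = (df[cols] > 0).sum(axis=1)

print(df["Count>0"].tolist())  # [2, 3]
```

Selecting cols before adding the new column keeps the count itself out of later recomputations.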
