Note: this page is a translation of a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverFlow original: http://stackoverflow.com/questions/23663623/

Time: 2020-09-13 22:03:13  Source: igfitidea

pandas: conditional count across row

Tags: python, pandas

Asked by DataSwede

I have a dataframe that has months for columns, and various departments for rows.

                2013April  2013May  2013June
        Dep1            0       10        15
        Dep2           10       15        20

I'm looking to add a column that counts the number of months with a value greater than 0. For example:

                2013April  2013May  2013June  Count>0
        Dep1            0       10        15        2
        Dep2           10       15        20        3

The number of columns this function needs to span is variable. I think defining a function then using .apply is the solution, but I can't seem to figure it out.

Answered by acushner

First, pick your columns, cols:

df[cols].apply(lambda s: (s > 0).sum(), axis=1)
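
A minimal, self-contained sketch of the line above, using the sample data from the question. Taking cols to be every column is an assumption made for illustration (any list of column labels would work), and the new column name is copied from the question's desired output.

```python
import pandas as pd

# Sample data from the question: departments as rows, months as columns.
df = pd.DataFrame(
    {"2013April": [0, 10], "2013May": [10, 15], "2013June": [15, 20]},
    index=["Dep1", "Dep2"],
)

# Assumption: every current column is a month column, so the selection
# adapts automatically when the number of month columns changes.
cols = list(df.columns)

# Row-wise apply: each row arrives as a Series `s`; count its values > 0.
df["Count>0"] = df[cols].apply(lambda s: (s > 0).sum(), axis=1)
print(df)
```

Because cols is captured before the new column is added, the count column itself is never included in the count.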

This takes advantage of the fact that True and False are 1 and 0, respectively, in Python.
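
A small illustration of that point, assuming one row of the sample data. The names row, mask and n_positive are hypothetical and used only here.

```python
import pandas as pd

# bool is a subclass of int in Python, so True counts as 1 and False as 0.
print(True + True + False)  # 2

# The same holds for a boolean Series: summing it counts the True entries.
row = pd.Series([0, 10, 15], index=["2013April", "2013May", "2013June"])
mask = row > 0               # False, True, True
n_positive = int(mask.sum())
print(n_positive)            # 2
```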

Actually, there's a better way:

(df[cols] > 0).sum(1)

This is better because it takes advantage of NumPy vectorization.
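
As a sketch of the vectorized form on the question's sample data: the comparison runs over the whole block of columns at once, and the booleans are then summed along each row. The variable names counts and via_apply are illustrative, and axis=1 is the keyword spelling of the positional 1 used above.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame(
    {"2013April": [0, 10], "2013May": [10, 15], "2013June": [15, 20]},
    index=["Dep1", "Dep2"],
)
cols = list(df.columns)  # assumption: all columns are month columns

# Compare every cell to 0 in one step, then sum the booleans per row.
counts = (df[cols] > 0).sum(axis=1)

# Sanity check: the row-wise apply version gives the same counts.
via_apply = df[cols].apply(lambda s: (s > 0).sum(), axis=1)
print((counts == via_apply).all())

df["Count>0"] = counts
print(df)
```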

%timeit df.apply(lambda s: (s > 0).sum(), axis=1)
10 loops, best of 3: 141 ms per loop

%timeit (df > 0).sum(1)
1000 loops, best of 3: 319 μs per loop
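
The %timeit lines are IPython magics. Outside IPython, a rough comparison can be sketched with the standard timeit module. The frame size and random contents below are assumptions, since the original benchmark does not say what df held, so absolute numbers will differ from those quoted.

```python
import timeit

import numpy as np
import pandas as pd

# Assumption: the original benchmark's frame size is not given, so this
# uses an arbitrary 1,000 x 12 frame of random integers in [0, 5).
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.integers(0, 5, size=(1_000, 12)))


def count_apply():
    # Row-wise Python-level loop: one lambda call per row.
    return df.apply(lambda s: (s > 0).sum(), axis=1)


def count_vectorized():
    # Single comparison over the whole frame, then a row-wise sum.
    return (df > 0).sum(axis=1)


repeats = 5
t_apply = timeit.timeit(count_apply, number=repeats) / repeats
t_vec = timeit.timeit(count_vectorized, number=repeats) / repeats
print(f"apply:      {t_apply * 1e3:.3f} ms per call")
print(f"vectorized: {t_vec * 1e3:.3f} ms per call")
```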