Original question: http://stackoverflow.com/questions/23663623/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on Stack Overflow, not to this page.
pandas: conditional count across row
Asked by DataSwede
I have a dataframe that has months for columns, and various departments for rows.
      2013April  2013May  2013June
Dep1          0       10        15
Dep2         10       15        20
I'm looking to add a column that counts the number of months with a value greater than 0. For example:
      2013April  2013May  2013June  Count>0
Dep1          0       10        15        2
Dep2         10       15        20        3
The number of columns this function needs to span is variable. I think defining a function then using .apply is the solution, but I can't seem to figure it out.
Answered by acushner
First, pick your columns, cols:
df[cols].apply(lambda s: (s > 0).sum(), axis=1)
This takes advantage of the fact that True and False are 1 and 0, respectively, in Python.
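As a rough illustration of the row-wise approach, here is a minimal sketch built on the sample data from the question. Taking `cols` to be every column in the frame is an assumption made for the example; in practice it would be whichever month columns you want to count.

```python
import pandas as pd

# Sample data copied from the question; the index holds the departments.
df = pd.DataFrame(
    {"2013April": [0, 10], "2013May": [10, 15], "2013June": [15, 20]},
    index=["Dep1", "Dep2"],
)

# Assumption for this sketch: count across every column in the frame.
cols = list(df.columns)

# bool is a subclass of int in Python, so summing booleans counts the True values.
print(True + True + False)  # 2

# With axis=1, each row is passed to the lambda as a Series `s`.
# (s > 0) is a boolean Series, and .sum() counts its True entries.
counts = df[cols].apply(lambda s: (s > 0).sum(), axis=1)
print(counts)
```

For this data the result holds 2 for Dep1 and 3 for Dep2, matching the Count>0 column the question asks for.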
Actually, there's a better way:
(df[cols] > 0).sum(1)
This is faster because the comparison and the sum are vectorized by NumPy instead of calling a Python function once per row:
%timeit df.apply(lambda s: (s > 0).sum(), axis=1)
10 loops, best of 3: 141 ms per loop
%timeit (df > 0).sum(1)
1000 loops, best of 3: 319 μs per loop
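To tie the vectorized form back to the question, the sketch below adds the Count>0 column to the sample frame. The name `month_cols` and the `startswith("2013")` filter are assumptions for illustration only; any selection that picks out the month columns would do, and it adapts when the number of months changes.

```python
import pandas as pd

# Sample data copied from the question; the index holds the departments.
df = pd.DataFrame(
    {"2013April": [0, 10], "2013May": [10, 15], "2013June": [15, 20]},
    index=["Dep1", "Dep2"],
)

# Hypothetical selection rule: treat every column whose name starts with
# "2013" as a month column. Selecting before adding the new column keeps
# the count from including itself if the code is run again.
month_cols = [c for c in df.columns if c.startswith("2013")]

# Compare the whole block at once, then sum the booleans along each row.
df["Count>0"] = (df[month_cols] > 0).sum(axis=1)
print(df)
```

Writing `sum(axis=1)` is equivalent to the `sum(1)` used in the answer; the keyword just makes the direction of the sum explicit.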

