Original URL: http://stackoverflow.com/questions/47239332/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me):
StackOverFlow
Take the sum of every N rows in a pandas series
Asked by HirofumiTamori
Suppose
import pandas as pd
s = pd.Series(range(50))
0 0
1 1
2 2
3 3
...
48 48
49 49
How can I get a new series consisting of the sum of every n rows?
The expected result, when n = 5, looks like this:
0 10
1 35
2 60
3 85
...
8 210
9 235
This can of course be done with loc or iloc and a Python loop, but I believe there is a simpler, idiomatic pandas way.
Also, this is a very simplified example; I'm not looking for an explanation of the sequence itself :). The actual series I'm working with has a time index, and its values are the number of events that occurred in each second.
Answered by cs95
GroupBy.sum
N = 5
s.groupby(s.index // N).sum()
0 10
1 35
2 60
3 85
4 110
5 135
6 160
7 185
8 210
9 235
dtype: int64
Floor-divide the index by N so that every 5 consecutive labels share a key, then group by that key.
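The grouping above keys on the index labels, so it assumes the default 0, 1, 2, ... integer index. The question mentions a time index, so here is a minimal sketch of the same idea keyed on row position instead. The helper name `sum_every_n` and the sample timestamps are made up for illustration; the `resample` line is an alternative that should give the same sums only when the index is a regular one-second series.

```python
import numpy as np
import pandas as pd

def sum_every_n(s: pd.Series, n: int) -> pd.Series:
    """Sum consecutive chunks of n rows by position, ignoring index labels."""
    # Row positions 0, 1, 2, ... floor-divided by n give one key per chunk.
    chunk_ids = np.arange(len(s)) // n
    return s.groupby(chunk_ids).sum()

# Hypothetical data: one event count per second, 50 seconds in total.
idx = pd.date_range("2024-01-01", periods=50, freq="s")
events = pd.Series(range(50), index=idx)

by_position = sum_every_n(events, 5)

# With a regular one-second index, 5-second resampling forms the same chunks
# and keeps a timestamp label for each chunk.
by_time = events.resample("5s").sum()

print(by_position)
print(by_time)
```

The positional version returns chunk numbers as its index, while `resample` labels each chunk with the timestamp at which it starts, which may be more useful for time series.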
numpy.reshape + sum
If the length of the series is a multiple of N (here 5), you can reshape the values into rows of N and sum each row:
s.values.reshape(-1, N).sum(1)
# array([ 10, 35, 60, 85, 110, 135, 160, 185, 210, 235])
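When the length is not a multiple of N, `reshape(-1, N)` raises a `ValueError`. One possible workaround, sketched below, is to pad the tail with zeros before reshaping. The helper name `chunk_sums_padded` and the 52-row series are assumptions chosen for illustration, not part of the original answer.

```python
import numpy as np
import pandas as pd

def chunk_sums_padded(values: np.ndarray, n: int) -> np.ndarray:
    """Sum chunks of n values; a shorter final chunk is padded with zeros."""
    remainder = len(values) % n
    pad = (n - remainder) % n  # 0 when the length is already a multiple of n
    padded = np.pad(values, (0, pad), mode="constant", constant_values=0)
    return padded.reshape(-1, n).sum(axis=1)

s = pd.Series(range(52))  # 52 is deliberately not a multiple of 5
result = chunk_sums_padded(s.to_numpy(), 5)
print(result)
```

Zero padding leaves sums unchanged, but it would distort other aggregations such as a mean, so this sketch only fits the summing case.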
numpy.add.at
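import numpy as np
b = np.zeros(len(s) // N)  # one float accumulator per chunk; assumes len(s) is a multiple of N
np.add.at(b, s.index // N, s.values)  # add each value into the slot for its chunk
b
# array([ 10., 35., 60., 85., 110., 135., 160., 185., 210., 235.])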
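As written, the accumulator has `len(s) // N` slots and a float dtype, so a series whose length is not a multiple of N would index past the end. The sketch below is one way to adapt it, assuming you want a shorter final chunk to be summed as well: it sizes the accumulator with ceiling division, keys on row position, and keeps the integer dtype. The variable names are illustrative.

```python
import numpy as np
import pandas as pd

N = 5
s = pd.Series(range(52))  # length is deliberately not a multiple of N

n_chunks = -(-len(s) // N)                   # ceiling division: 11 chunks for 52 rows
positions = np.arange(len(s)) // N           # chunk number of each row, by position
totals = np.zeros(n_chunks, dtype=s.dtype)   # keep the integer dtype of s
np.add.at(totals, positions, s.to_numpy())   # unbuffered add into each chunk's slot

result = pd.Series(totals)
print(result)
```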