Note: This page reproduces a popular Stack Overflow question and its answers, licensed under CC BY-SA 4.0. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original question: http://stackoverflow.com/questions/47239332/

Take the sum of every N rows in a pandas series

Tags: python, pandas

Asked by HirofumiTamori

Suppose I have the following series:

import pandas as pd

s = pd.Series(range(50))

0      0
1      1
2      2
3      3
...
48     48
49     49

How can I get a new series that consists of the sum of every n rows?

For n = 5, the expected result is:

0      10
1      35
2      60
3      85
...
8      210
9      235

This can of course be done with loc or iloc and a Python loop, but I believe there is a simpler, idiomatic pandas way.

Also, this is a very simplified example; I am not looking for an explanation of the sequence :). The actual series I am working with has a time index, and its values are the number of events that occurred in each second.

Answered by cs95

GroupBy.sum

N = 5
s.groupby(s.index // N).sum()

0     10
1     35
2     60
3     85
4    110
5    135
6    160
7    185
8    210
9    235
dtype: int64

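Integer division of the index by N maps each run of N consecutive labels to one group key (0-4 to 0, 5-9 to 1, and so on), and sum() then adds up each group. This relies on the index being the default 0, 1, 2, ... integer index.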
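The question mentions that the real data has a time index, where s.index // N does not apply directly. A minimal sketch of two possible adaptations follows, assuming a hypothetical series ts with one value per second (the names ts, idx, by_position and by_time are illustrative and not from the original answer): grouping by row position, or binning by time with resample.

```python
import numpy as np
import pandas as pd

N = 5
# Hypothetical time-indexed data: one event count per second
idx = pd.date_range("2020-01-01", periods=50, freq=pd.Timedelta(seconds=1))
ts = pd.Series(range(50), index=idx)

# Option 1: group by row position, so the index labels do not matter
by_position = ts.groupby(np.arange(len(ts)) // N).sum()

# Option 2: group by time, summing each N-second bin
by_time = ts.resample(pd.Timedelta(seconds=N)).sum()
```

The two agree here only because the sample data has exactly one row per second with no gaps. With missing seconds, positional grouping still sums N rows at a time, while resample sums whatever falls inside each time window.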



numpy.reshape + sum

If the length of the series is a multiple of N (5 here), you can reshape the values into rows of N elements and sum along each row:

s.values.reshape(-1, N).sum(1)
# array([ 10,  35,  60,  85, 110, 135, 160, 185, 210, 235])
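When the length is not a multiple of N, reshape(-1, N) raises an error. One possible alternative, not part of the original answer, is numpy.add.reduceat, which sums the slices that begin at given start offsets. The sketch below assumes a hypothetical series s2 of length 52, so the last group holds only two values.

```python
import numpy as np
import pandas as pd

N = 5
s2 = pd.Series(range(52))  # hypothetical series; 52 is not a multiple of 5

# Start offset of each group: 0, 5, 10, ..., 50
starts = np.arange(0, len(s2), N)

# Sum each slice [start, next start); the last slice runs to the end of the array
sums = np.add.reduceat(s2.to_numpy(), starts)

# Wrap the array back into a Series, since the question asks for a new series
result = pd.Series(sums)
```

The final element of sums is 50 + 51 = 101, the sum of the trailing partial group.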


numpy.add.at

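import numpy as np

# One accumulator slot per group; ceiling division also covers a trailing partial group
b = np.zeros(-(-len(s) // N))
# Unbuffered in-place add: each value is added to the slot of its group
np.add.at(b, s.index // N, s.values)
b
# array([ 10.,  35.,  60.,  85., 110., 135., 160., 185., 210., 235.])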
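The reason for using np.add.at rather than a plain b[idx] += values is that fancy-indexed in-place addition is buffered: when an index repeats, only one of the additions takes effect. A small sketch with made-up arrays (groups, values, buffered and unbuffered are illustrative names) shows the difference.

```python
import numpy as np

groups = np.array([0, 0, 1])        # group 0 appears twice
values = np.array([1.0, 2.0, 3.0])

buffered = np.zeros(2)
buffered[groups] += values          # repeated index 0 is only updated once

unbuffered = np.zeros(2)
np.add.at(unbuffered, groups, values)  # every value is accumulated
print(unbuffered)  # [3. 3.]
```

The result is floating point because np.zeros creates a float array by default; passing dtype=int to np.zeros would keep integer sums.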