Cumulative Sum Function on Pandas Data Frame

Notice: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverFlow original: http://stackoverflow.com/questions/33310050/

Date: 2020-09-14 00:06:05  Source: igfitidea

python, pandas

Asked by AME

I am attempting to capture a "running" cumulative sum given a series of period amounts.

See example:

[Image from the original question not available]

df = df[1:4].cumsum() # this doesn't return the desired result

Accepted answer by Brian

You're looking for the axis parameter. Many Pandas functions take this argument to choose whether an operation runs along the rows or along the columns. With axis=0 the operation moves down the rows, so each column is processed on its own; with axis=1 it moves across the columns, so each row is processed on its own. Your running total travels from column to column within each row, so you want axis=1.

df.cumsum(axis=1) by itself works on your example to produce the output table.

In [3]: df.cumsum(axis=1)
Out[3]:
      1   2   3   4
10   16  30  41  61
51   13  29  40  50
13   11  30  45  61
321  12  27  37  52
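
To try this yourself, here is a minimal sketch that rebuilds the frame. The per-period amounts were only shown as an image in the question, so the values below are back-computed from the cumulative output above (differences between adjacent columns) and should be read as an assumption rather than the asker's actual data.

```python
import pandas as pd

# Per-period amounts, reconstructed by differencing Out[3] above.
# This is an assumption: the original input table was not preserved as text.
df = pd.DataFrame(
    [[16, 14, 11, 20],
     [13, 16, 11, 10],
     [11, 19, 15, 16],
     [12, 15, 10, 15]],
    index=[10, 51, 13, 321],
    columns=["1", "2", "3", "4"],
)

across = df.cumsum(axis=1)  # running total moves left to right within each row
down = df.cumsum(axis=0)    # running total moves top to bottom within each column

print(across)
print(down)
```

Printing both results side by side makes the difference visible: across matches the table above, while down accumulates each column separately.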

I suspect you want to restrict the running sum to a specific range of columns, though. To do that, you can use .loc with the column labels (strings, in my case).

In [4]: df.loc[:, '2':'3'].cumsum(axis=1)
Out[4]:
      2   3
10   14  25
51   16  27
13   19  34
321  15  25

.loc is label-based and is inclusive of the bounds. If you want to find out more about indexing in Pandas, check the docs.
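
The inclusive bounds are easy to miss if you are used to Python slices. As a rough illustration, the sketch below compares a label slice with .loc against a positional slice with .iloc; the two-row frame is a cut-down, assumed version of the example data.

```python
import pandas as pd

# Small illustrative frame with string column labels (assumed data).
df = pd.DataFrame(
    [[16, 14, 11, 20], [13, 16, 11, 10]],
    index=[10, 51],
    columns=["1", "2", "3", "4"],
)

by_label = df.loc[:, "2":"3"]   # label slice: both '2' and '3' are included
by_position = df.iloc[:, 1:3]   # positional slice: position 3 is excluded

print(by_label.columns.tolist())
print(by_position.columns.tolist())
print(by_label.cumsum(axis=1))
```

Both slices select the same two columns here, but only because the end of the .iloc slice was moved one position past the last column wanted.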

Answer by chrisb

You want axis=1 to sum across each row.

df.cumsum(axis=1)

Side note: [1:4] slices the rows by default (i.e. numpy or list-like semantics). If you want to select columns by label, use df.loc[:, 1:4].
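
A quick way to see why the original attempt went wrong is sketched below. It assumes a hypothetical frame whose column labels are the integers 1 to 4, so that df.loc[:, 1:4] is a valid label slice; the values are illustrative only.

```python
import pandas as pd

# Hypothetical frame with integer column labels 1..4 (assumed for illustration).
df = pd.DataFrame(
    [[16, 14, 11, 20], [13, 16, 11, 10], [11, 19, 15, 16], [12, 15, 10, 15]],
    index=[10, 51, 13, 321],
    columns=[1, 2, 3, 4],
)

rows = df[1:4]          # positional row slice: rows at positions 1, 2 and 3
cols = df.loc[:, 1:4]   # label-based column slice: columns 1 through 4 inclusive

print(rows.index.tolist())
print(cols.columns.tolist())
print(rows.cumsum())    # default axis=0, so this accumulates down the rows
```

So df[1:4].cumsum() drops the first row and then accumulates down each column, which is not the per-row running total the question asks for.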