Note: this page reproduces a popular StackOverflow question under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/35090498/

Time: 2020-09-14 00:35:16  Source: igfitidea

How to calculate percent change compared to the beginning value using pandas?

python pandas dataframe percentage

Asked by E.K.

I have a DataFrame and need to calculate, for each company, the percent change compared to the beginning of the year. Is there any way to use pct_change() or another method to do this? Thanks!

df looks like:

security    date        price
IBM         1/1/2016    100
IBM         1/2/2016    102
IBM         1/3/2016    108
AAPL        1/1/2016    1000
AAPL        1/2/2016    980
AAPL        1/3/2016    1050
AAPL        1/4/2016    1070

The result I want:

security    date        price   change
IBM         1/1/2016    100     NA
IBM         1/2/2016    102     2%
IBM         1/3/2016    108     8%
AAPL        1/1/2016    1000    NA
AAPL        1/2/2016    980     -2%
AAPL        1/3/2016    1050    5%
AAPL        1/4/2016    1070    7%

Answered by Stefan

Sounds like you are looking for an expanding_window version of pct_change(). This doesn't exist out of the box AFAIK, but you could roll your own:

df.groupby('security')['price'].apply(lambda x: x.div(x.iloc[0]).subtract(1).mul(100))
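For reference, here is a minimal sketch of the same idea written with `transform`, so the result lines up with the original rows and can be assigned straight back as a column. The sample data is copied from the question. The `change` column name and the step that blanks the first row of each group (to match the NA in the desired output) are my own assumptions, not part of the answer above.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "security": ["IBM", "IBM", "IBM", "AAPL", "AAPL", "AAPL", "AAPL"],
    "date": ["1/1/2016", "1/2/2016", "1/3/2016",
             "1/1/2016", "1/2/2016", "1/3/2016", "1/4/2016"],
    "price": [100, 102, 108, 1000, 980, 1050, 1070],
})

# First price of each security, broadcast back to every row of its group
first_price = df.groupby("security")["price"].transform("first")

# Percent change relative to that first price
df["change"] = (df["price"] / first_price - 1) * 100

# Assumption: the question shows NA on each group's first row, so blank it out
is_first_row = df.groupby("security").cumcount() == 0
df.loc[is_first_row, "change"] = float("nan")

print(df)
```

Like the one-liner above, this assumes the rows are already in date order within each security.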

Answered by fivetentaylor

This works, assuming you're already ordered by date within each possible grouping.

def pct_change(df):
    # Percent change of each price relative to the group's first price
    df['pct'] = 100 * (df.price / df.iloc[0].price - 1)
    return df

df.groupby('security').apply(pct_change)
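If the rows are not already ordered by date, one option is to parse and sort the dates first. The sketch below does that on a deliberately shuffled copy of the question's data. It assumes the dates are month/day/year strings, which the question does not state explicitly, and the `base` variable is just an illustrative name.

```python
import pandas as pd

# The question's data, deliberately shuffled to show the sorting step
df = pd.DataFrame({
    "security": ["AAPL", "IBM", "AAPL", "IBM", "AAPL", "IBM", "AAPL"],
    "date": ["1/3/2016", "1/2/2016", "1/1/2016", "1/3/2016",
             "1/4/2016", "1/1/2016", "1/2/2016"],
    "price": [1050, 102, 1000, 108, 1070, 100, 980],
})

# Assumption: dates are month/day/year
df["date"] = pd.to_datetime(df["date"], format="%m/%d/%Y")

# Order by date within each security so the first row is the starting value
df = df.sort_values(["security", "date"]).reset_index(drop=True)

# Starting price per security, aligned with every row
base = df.groupby("security")["price"].transform("first")
df["pct"] = 100 * (df["price"] / base - 1)

print(df)
```

Here the first row of each group comes out as 0 rather than NA, since it is compared with itself.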

Answered by Marco

I had the same problem, but solved it this way:

(The only difference is that in my case each company was a column rather than a row.)

For each column of my dataframe I did:

df[column] = df[column].pct_change().cumsum()

pct_change() calculates the change between the current value and the previous one, and cumsum() adds them all together.
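One caveat worth checking: summing period-over-period changes only approximates the change relative to the first value, and the gap grows as the individual changes get larger. The sketch below compares the summed version with two exact alternatives on a wide-format frame (one column per company). The frame and the variable names are illustrative assumptions, using the first three prices from the question.

```python
import pandas as pd

# Wide format: one column per company (first three prices from the question)
prices = pd.DataFrame(
    {"IBM": [100.0, 102.0, 108.0], "AAPL": [1000.0, 980.0, 1050.0]},
    index=["1/1/2016", "1/2/2016", "1/3/2016"],
)

# The approach above: sum of period-over-period changes (an approximation)
summed = prices.pct_change().cumsum()

# Exact alternative 1: compound the period changes instead of summing them
compounded = (1 + prices.pct_change()).cumprod() - 1

# Exact alternative 2: divide every row by the first row directly
direct = prices / prices.iloc[0] - 1

print(summed)
print(compounded)
print(direct)
```

For IBM the summed value on the last date is about 0.0788, while the compounded and direct versions both give 0.08, matching the 8% in the question.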
