pandas: creating a new column name based on a loop variable and an additional string

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/48587799/

Date: 2020-09-14 05:07:29  Source: igfitidea

Creating a new column name based on a loop variable and an additional string

Tags: python, pandas, dataframe

Asked by A.Papa

I want to create a percentage-change column for each float column in my dataframe and store it in a new column each time, named after the original column with "_change" appended.

I tried this, but it does not seem to work. Any ideas?

for col in df.columns:
        if df[col].dtypes == "float":
           df[ col&'_change'] = (df.col - df.groupby(['New_ID']).col.shift(1))/ df.col

For example, if my column is df["Expenses"], I would like to save the percentage change in df["Expenses_change"]. Edited to add an example data frame and the expected output.

df initially:

Index  ID  Reporting_Date  Sales_Am  Exp_Am
0      1   01/01/2016      1000      900
1      1   02/01/2016      1050      950
2      1   03/01/2016      1060      960
3      2   01/01/2016      2000      1850
4      2   02/01/2016      2500      2350
5      2   03/01/2016      3000      2850

df after the loop:

Index  ID  Reporting_Date  Sales_Am  Sales_Am_chge  Exp_Am  Exp_Am_chge
0      1   01/01/2016      1000      Null           900     Null
1      1   02/01/2016      1050      5%             950     6%
2      1   03/01/2016      1060      1%             960     1%
3      2   01/01/2016      2000      Null           1850    Null
4      2   02/01/2016      2500      25%            2350    27%
5      2   03/01/2016      3000      20%            2850    21%

Keep in mind that I have more than 2 columns in my dataframe.

Answered by Zach

Why are you using '&' instead of '+' in

df[ col&'_change']

?
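For context, `&` is Python's bitwise AND operator and is not defined between two strings, so the expression above fails with a TypeError before pandas does any work. A minimal sketch of the difference follows; the column name "Expenses" comes from the question, while the variable names are only illustrative.

```python
col = "Expenses"

# '&' is bitwise AND; it is not defined for two str operands.
try:
    new_name = col & "_change"
except TypeError as exc:
    error_message = str(exc)

# '+' concatenates strings, which is what the loop needs.
new_name = col + "_change"

print(error_message)
print(new_name)
```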

Answered by jpp

String concatenation is performed in Python via the + operator.

So changing to col + '_change' will fix this issue for you.

You might find it helpful to read the relevant Python documentation.
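As a small sketch of the naming step on its own, the snippet below builds the "_change" names for a list of column labels, once with `+` and once with an equivalent f-string. The list `columns` is an illustrative stand-in for `df.columns`, using the column names from the question's sample data.

```python
columns = ["Sales_Am", "Exp_Am"]

# Concatenation with '+', as suggested in this answer.
names_plus = [col + "_change" for col in columns]

# Equivalent f-string form (Python 3.6+).
names_fstring = [f"{col}_change" for col in columns]

print(names_plus)
print(names_fstring)
```

Note that `+` needs both operands to be strings. If a dataframe has non-string column labels (integers, for example), the f-string form or an explicit `str(col)` avoids a TypeError.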

Answered by DanSan

As mentioned in the other answers, changing & to + should do it. I was also getting errors from using dot access (df.col) instead of square brackets (df[col]), because df.col looks for a column literally named "col" rather than using the loop variable, so I changed those too.
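To illustrate the dot-versus-bracket point, here is a minimal sketch with a made-up one-column dataframe (the data and variable names are assumptions for the example, not from the question). Bracket access evaluates the variable `col`, whereas dot access treats `col` as a literal attribute name.

```python
import pandas as pd

# Hypothetical one-column frame, only for this demonstration.
df = pd.DataFrame({"Expenses": [900.0, 950.0, 960.0]})
col = "Expenses"

# Bracket access evaluates the variable, so this selects the "Expenses" column.
by_bracket = df[col]

# Dot access uses the literal attribute name "col"; no such column exists here.
try:
    df.col
    dot_access_failed = False
except AttributeError:
    dot_access_failed = True

print(by_bracket.tolist())
print(dot_access_failed)
```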

The code below has been tested in Python 3 and it works :)

# Only float columns get a matching "<name>_change" column
for col in df.columns:
    if df[col].dtypes == "float":
        df[col + '_change'] = (df[col] - df.groupby(['repeat_present'])[col].shift(1)) / df[col]

Enjoy!
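For a self-contained run, the sketch below applies the same idea to the sample data from the question. It makes a few assumptions that are not in the original: the amounts are stored as floats, the grouping column is `ID`, and the change is divided by the previous value, which is what the question's expected output implies (2000 to 2500 is shown as 25%). The loop in the answer above divides by the current value instead.

```python
import pandas as pd

# Sample data from the question; amounts stored as floats (an assumption).
df = pd.DataFrame({
    "ID": [1, 1, 1, 2, 2, 2],
    "Reporting_Date": ["01/01/2016", "02/01/2016", "03/01/2016"] * 2,
    "Sales_Am": [1000.0, 1050.0, 1060.0, 2000.0, 2500.0, 3000.0],
    "Exp_Am": [900.0, 950.0, 960.0, 1850.0, 2350.0, 2850.0],
})

# Collect the float column names up front so the new columns are not revisited.
float_cols = df.select_dtypes(include="float").columns

for col in float_cols:
    # Previous value within each ID group; the first row of a group gets NaN.
    previous = df.groupby("ID")[col].shift(1)
    # Change relative to the previous value (not the current one).
    df[col + "_change"] = (df[col] - previous) / previous

print(df.round(3))
```

The first row of each `ID` group has no previous value, so its change is NaN, matching the "Null" entries in the expected output. pandas also provides a `pct_change` method on grouped columns that may be worth a look for the same calculation.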