Python: adding a calculated column in Pandas

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/45393123/

Time: 2020-08-19 16:58:35  Source: igfitidea

Adding calculated column in Pandas

python pandas

Asked by JD2775

I have a dataframe with 10 columns. I want to add a new column 'age_bmi' which should be a calculated column multiplying 'age' * 'bmi'. age is an INT, bmi is a FLOAT.

That then creates the new dataframe with 11 columns.

Something I am doing isn't quite right. I think it's a syntax issue. Any ideas?

Thanks

df2['age_bmi'] = df(['age'] * ['bmi'])
print(df2)

Answered by Cory Madden

Try df2['age_bmi'] = df.age * df.bmi

You're trying to call the dataframe as a function, when what you need is the values of the columns. You can access a column by key, like a dictionary, or by attribute if its name is a valid Python identifier (no spaces, for example) that doesn't match an existing DataFrame attribute or method.
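
As a minimal sketch of the difference between the access styles (the sample values and the extra 'count' column are made up for illustration, they are not from the question):

```python
import pandas as pd

# Hypothetical sample data; 'age' and 'bmi' follow the question,
# 'count' is an invented column that clashes with a DataFrame method.
df = pd.DataFrame({"age": [25, 40], "bmi": [22.5, 30.0], "count": [1, 2]})

# Key (bracket) access works for any column name.
product_by_key = df["age"] * df["bmi"]

# Attribute access works here because 'age' and 'bmi' are valid
# identifiers that do not clash with existing DataFrame attributes.
product_by_attr = df.age * df.bmi

# 'count' is also a DataFrame method, so df.count is the method,
# not the column; the column is only reachable via df["count"].
count_is_method = callable(df.count)
count_column = df["count"]

# A DataFrame is not callable, so df(...) raises a TypeError.
try:
    df(["age"])
except TypeError as exc:
    call_error = exc
```

Bracket access is the safer default when column names come from external data, since it never collides with method names.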

Someone linked this in a comment the other day and it's pretty awesome. I recommend giving it a watch, even if you don't do the exercises: https://www.youtube.com/watch?v=5JnMutdy6Fw

Answered by Zero

As pointed out by Cory, you're calling a dataframe as a function, which won't work as you expect. Here are 4 ways to multiply two columns; in most cases you'd use the first method.

In [299]: df['age_bmi'] = df.age * df.bmi

or,

In [300]: df['age_bmi'] = df.eval('age*bmi')

or,

In [301]: df['age_bmi'] = pd.eval('df.age*df.bmi')

or,

In [302]: df['age_bmi'] = df.age.mul(df.bmi)
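
To check that the four forms agree, here is a small self-contained sketch. The sample values are invented for illustration, and the final `assign` call is not one of the four methods above; it is an extra option, assuming you want a separate dataframe (like `df2` in the question) instead of modifying `df` in place.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({"age": [25, 40, 31], "bmi": [22.5, 30.0, 18.0]})

# The four approaches, each producing a Series of age * bmi.
results = {
    "operator": df.age * df.bmi,
    "df.eval": df.eval("age*bmi"),
    "pd.eval": pd.eval("df.age*df.bmi"),  # looks up 'df' in the calling scope
    "mul": df.age.mul(df.bmi),
}

for name, series in results.items():
    print(name, series.tolist())

# Extra option (not from the answers): assign returns a new dataframe
# with the added column and leaves the original df unchanged.
df2 = df.assign(age_bmi=df["age"] * df["bmi"])
```

The eval-based forms take the expression as a string, which can be convenient for expressions built at runtime; for a simple product the plain operator is usually the most readable.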