pandas: How do I add a new computed column to a DataFrame?

Notice: this page is a Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/42796354/

Date: 2020-09-14 03:11:45  Source: igfitidea

How can I add a new computed column in a dataframe?

Tags: python, pandas

Asked by MB41

I'm trying to compute the age of a person from the data that I have:

Data columns in 'Person' Dataframe:
TodaysDate   non-null datetime64[ns]
YOB          non-null float64

So I want to make a new column inside that dataframe called 'Age' and so far I have the following code:

Person['Age'] = map(sum, (Person.ix[0,'TodaysDate']).year, -(Person['YOB']))

TypeError: 'int' object is not iterable

I've also tried:

Person['Age'] = map((Person.ix[0,'TodaysDate']).year - Person['YOB'])

TypeError: map() must have at least two arguments.

I've tried a few different methods that were posted on other questions, but none seem to work. This seems very simple to do, but I can't get it to work.

Any ideas how I can use the map function to subtract the float column YOB from the year of the datetime column TodaysDate and put the value into the Age column? I'd like to do this for every row in the dataframe.

Thank you!

Accepted answer by MaxU

Data:

In [5]: df
Out[5]:
    YOB
0  1955
1  1965
2  1975
3  1985

You don't need an extra TodaysDate column; you can get the current year dynamically:

In [6]: df['Age'] = pd.Timestamp.now().year - df.YOB  # pd.datetime was removed in pandas 2.0

In [7]: df
Out[7]:
    YOB  Age
0  1955   62
1  1965   52
2  1975   42
3  1985   32
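
If you would rather keep the TodaysDate column from the question, the same vectorized subtraction applies. The following is only a minimal sketch: the frame name person, the fixed date 2017-03-14, and the integer cast are assumptions made for illustration and are not part of the original answer.

```python
import pandas as pd

# Hypothetical data modeled on the question: a datetime column and a float YOB.
person = pd.DataFrame({
    "TodaysDate": pd.to_datetime(["2017-03-14"] * 4),
    "YOB": [1955.0, 1965.0, 1975.0, 1985.0],
})

# Vectorized subtraction: .dt.year extracts the year of every row at once,
# so neither map() nor an explicit loop is needed.
person["Age"] = person["TodaysDate"].dt.year - person["YOB"]

# YOB is float, so the result is float as well; cast to int only when
# the column has no missing values.
person["Age"] = person["Age"].astype(int)

print(person)
```

Using a fixed date keeps the result reproducible; with a live clock the ages change every year.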

Alternatively, you can use the DataFrame.eval() method:

In [16]: df
Out[16]:
    YOB
0  1955
1  1965
2  1975
3  1985

In [17]: df.eval("Age = @pd.Timestamp.now().year - YOB", inplace=True)

In [18]: df
Out[18]:
    YOB  Age
0  1955   62
1  1965   52
2  1975   42
3  1985   32
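
A slightly more explicit variant of the eval() approach computes the year first and passes it in as a local variable. This is a sketch under assumptions: the name current_year and the fixed value 2017 are hypothetical, chosen so the output does not depend on the clock.

```python
import pandas as pd

df = pd.DataFrame({"YOB": [1955, 1965, 1975, 1985]})

# Hypothetical fixed reference year; in practice this could be
# pd.Timestamp.now().year.
current_year = 2017

# Inside the expression, @name refers to a local Python variable,
# while a bare name such as YOB refers to a column of df.
df.eval("Age = @current_year - YOB", inplace=True)

print(df)
```

Keeping the expression free of method calls makes it easier to read and avoids relying on how eval() parses attribute access.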

Answer by piRSquared

This answer is mostly just propaganda for assign. I'm a fan of assign because it returns a new pd.DataFrame that is a copy of the old pd.DataFrame with the additional columns included. In some contexts, returning a new pd.DataFrame is more appropriate. I feel that the syntax is clean and intuitive.

Also, note that I have added zero value in regards to the calculation as I've completely ripped off @MaxU's answer.

df.assign(Age=pd.Timestamp.now().year - df.YOB)

    YOB  Age
0  1955   62
1  1965   52
2  1975   42
3  1985   32
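
To illustrate the point that assign returns a new frame rather than modifying the old one, here is a minimal sketch. The names current_year, with_age, and with_age_chained, and the fixed year 2017, are assumptions for illustration only.

```python
import pandas as pd

df = pd.DataFrame({"YOB": [1955, 1965, 1975, 1985]})
current_year = 2017  # hypothetical fixed year for a reproducible result

# assign returns a new DataFrame; df itself is left untouched.
with_age = df.assign(Age=current_year - df["YOB"])

# A callable is evaluated against the frame being assigned to, which is
# handy in method chains where the intermediate frame has no name.
with_age_chained = df.assign(Age=lambda d: current_year - d["YOB"])

print(list(df.columns))
print(list(with_age.columns))
```

Because the original frame is unchanged, the result has to be bound to a name (or reassigned to df) to be kept.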