Calculating age from date/time format in python/pandas

Source: Stack Overflow, http://stackoverflow.com/questions/46508895/ (CC BY-SA 4.0; you are free to use and share this content, but you must attribute it to the original authors)

pandas

Asked by Nivi

I am looking for a way to calculate age from the following date/time format in Python.

e.g. 1956-07-01T00:00:00Z

I have written code that does this by extracting the first four characters of the string, converting them to an int, and subtracting the result from 2017, but I would like to know whether there is a more efficient way to do it.

Answered by YOBEN_S

Is this what you want?

(pd.to_datetime('today').year-pd.to_datetime('1956-07-01').year)

Out[83]: 61
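
Subtracting the years alone does not check whether the birthday has already passed in the current year. A minimal sketch of that adjustment follows; the helper name `age_in_years` and its optional `today` argument are hypothetical, added only so the example is reproducible.

```python
import pandas as pd

def age_in_years(birth, today=None):
    """Whole years between birth and today (hypothetical helper, not from the answer)."""
    birth = pd.to_datetime(birth)
    today = pd.to_datetime('today') if today is None else pd.to_datetime(today)
    # Subtract one year if this year's birthday has not been reached yet.
    before_birthday = (today.month, today.day) < (birth.month, birth.day)
    return today.year - birth.year - int(before_birthday)

print(age_in_years('1956-07-01', today='2017-06-30'))  # 60
print(age_in_years('1956-07-01', today='2017-07-01'))  # 61
```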

Answered by piRSquared

I'd take the number of days from the timedelta object and divide it by 365.25.

(pd.to_datetime('today') - pd.to_datetime('1956-07-01')).days / 365.25

61.24845995893224
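
The same division can be applied to a whole column at once. The sketch below assumes the data is a pandas Series of strings in the question's format; the sample values and the fixed reference date are made up for illustration.

```python
import pandas as pd

# Sample birth dates in the question's format (made-up values).
births = pd.Series(['1956-07-01T00:00:00Z', '1990-01-15T00:00:00Z'])

# utc=True yields timezone-aware timestamps, matching the trailing 'Z'.
birth_ts = pd.to_datetime(births, utc=True)

# Fixed reference date so the result is reproducible;
# pd.Timestamp.now(tz='UTC') could be used instead for the current date.
reference = pd.Timestamp('2017-10-01', tz='UTC')

# Fractional age in years for every row.
age = (reference - birth_ts).dt.days / 365.25
print(age)
```

Flooring the result with `age.astype(int)` gives whole years, though the 365.25 approximation can be off by one within a day or so of a birthday.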

Answered by Keiku

If the data contains a year outside the range that pandas nanosecond timestamps can represent (e.g. 1601), pd.to_datetime raises an error, as shown below.

import pandas as pd

(pd.to_datetime('today').year-pd.to_datetime('1601-07-01').year)

# Traceback (most recent call last):
#   File "/home/kuroyanagi/.pyenv/versions/anaconda3-4.4.0/lib/python3.6/site-packages/pandas/core/tools/datetimes.py", line 444, in _convert_listlike
#     values, tz = tslib.datetime_to_datetime64(arg)
#   File "pandas/_libs/tslib.pyx", line 1810, in pandas._libs.tslib.datetime_to_datetime64 (pandas/_libs/tslib.c:33275)
# TypeError: Unrecognized value type: <class 'str'>
# During handling of the above exception, another exception occurred:
# Traceback (most recent call last):
#   File "/home/kuroyanagi/.pyenv/versions/anaconda3-4.4.0/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 2881, in run_code
#     exec(code_obj, self.user_global_ns, self.user_ns)
#   File "<ipython-input-45-829e219d9060>", line 1, in <module>
#     (pd.to_datetime('today').year-pd.to_datetime('1601-07-01').year)
#   File "/home/kuroyanagi/.pyenv/versions/anaconda3-4.4.0/lib/python3.6/site-packages/pandas/core/tools/datetimes.py", line 518, in to_datetime
#     result = _convert_listlike(np.array([arg]), box, format)[0]
#   File "/home/kuroyanagi/.pyenv/versions/anaconda3-4.4.0/lib/python3.6/site-packages/pandas/core/tools/datetimes.py", line 447, in _convert_listlike
#     raise e
#   File "/home/kuroyanagi/.pyenv/versions/anaconda3-4.4.0/lib/python3.6/site-packages/pandas/core/tools/datetimes.py", line 435, in _convert_listlike
#     require_iso8601=require_iso8601
#   File "pandas/_libs/tslib.pyx", line 2355, in pandas._libs.tslib.array_to_datetime (pandas/_libs/tslib.c:46617)
#   File "pandas/_libs/tslib.pyx", line 2538, in pandas._libs.tslib.array_to_datetime (pandas/_libs/tslib.c:45511)
#   File "pandas/_libs/tslib.pyx", line 2506, in pandas._libs.tslib.array_to_datetime (pandas/_libs/tslib.c:44978)
#   File "pandas/_libs/tslib.pyx", line 2500, in pandas._libs.tslib.array_to_datetime (pandas/_libs/tslib.c:44859)
#   File "pandas/_libs/tslib.pyx", line 1517, in pandas._libs.tslib.convert_to_tsobject (pandas/_libs/tslib.c:28598)
#   File "pandas/_libs/tslib.pyx", line 1774, in pandas._libs.tslib._check_dts_bounds (pandas/_libs/tslib.c:32752)
# pandas._libs.tslib.OutOfBoundsDatetime: Out of bounds nanosecond timestamp: 1601-07-01 00:00:00
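
The traceback ends in OutOfBoundsDatetime because nanosecond-resolution timestamps only cover roughly the years 1677 to 2262. One possible workaround, sketched here on the assumption that plain Python dates are acceptable, is the standard library's `datetime.date`, which supports years 1 through 9999. The helper name `years_between` is hypothetical.

```python
from datetime import date

def years_between(birth_iso, today):
    """Whole years from an ISO date string to `today` (hypothetical helper)."""
    # Keep only the leading 'YYYY-MM-DD' part, dropping any time suffix.
    birth = date.fromisoformat(birth_iso[:10])
    # Subtract one year if this year's birthday has not been reached yet.
    before_birthday = (today.month, today.day) < (birth.month, birth.day)
    return today.year - birth.year - int(before_birthday)

print(years_between('1601-07-01', date(2017, 10, 1)))            # 416
print(years_between('1956-07-01T00:00:00Z', date(2017, 10, 1)))  # 61
```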

For data that includes such out-of-range years, you can calculate the age as follows.

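import numpy as np
import pandas as pd

date = pd.Series(['1601-07-01', '1956-07-01'])

def elapsed_years(date):
    # Reference point: the current year and month.
    today = pd.to_datetime('today')
    # Slice year and month out of the 'YYYY-MM-DD' strings; nothing is parsed
    # as a timestamp, so years outside the pandas range are fine.
    # The builtin float replaces np.float, which newer NumPy versions removed.
    year = date.str.slice(0, 4).astype(float)
    month = date.str.slice(5, 7).astype(float)
    # Whole years elapsed, counted in months (the day of the month is ignored).
    duration = np.floor((12 * (today.year - year) + (today.month - month)) / 12)
    return duration

elapsed_years(date)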
# Out[46]: 
# 0    416.0
# 1     61.0
# dtype: float64