Calculating age from date/time format in python/pandas
Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license, cite the original URL, and attribute it to the original authors on StackOverflow (not to the translator).
Original: http://stackoverflow.com/questions/46508895/
Asked by Nivi
I am looking for a way to calculate age from the following date/time format in Python.
e.g. 1956-07-01T00:00:00Z
I have written code that does this by extracting the first four characters of the string, converting them to an int, and subtracting the result from 2017, but I would like to know whether there is a more efficient way to do it.
Answered by YOBEN_S
Is this what you want?
(pd.to_datetime('today').year-pd.to_datetime('1956-07-01').year)
Out[83]: 61
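Subtracting the years alone counts a birthday that has not yet happened in the reference year. A minimal sketch that corrects for this is shown here. The helper name `age_in_years` is hypothetical (it is not part of the answer), and a fixed reference date is assumed instead of 'today' so the result is reproducible.

```python
import pandas as pd

def age_in_years(born, ref):
    """Completed years between two dates (hypothetical helper)."""
    born = pd.Timestamp(born)
    ref = pd.Timestamp(ref)
    # Subtract one if the birthday has not yet occurred in the reference year.
    before_birthday = (ref.month, ref.day) < (born.month, born.day)
    return ref.year - born.year - int(before_birthday)

print(age_in_years('1956-07-01T00:00:00Z', '2017-10-01'))  # 61
print(age_in_years('1956-07-01T00:00:00Z', '2017-06-30'))  # 60
```

Only the year, month and day attributes are compared, so the UTC marker in the input string does not matter here.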
Answered by piRSquared
I'd take the number of days from the timedelta object and divide it by 365.25
(pd.to_datetime('today') - pd.to_datetime('1956-07-01')).days / 365.25
61.24845995893224
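The same idea can be applied to a whole column at once. The sketch below assumes a Series of strings in the question's format (the sample values are illustrative) and a fixed UTC reference date chosen for reproducibility. Because 365.25 is an average year length, the result can be off by one within a day or so of a birthday.

```python
import pandas as pd

# Sample column in the question's format (values are illustrative).
born = pd.Series(['1956-07-01T00:00:00Z', '1990-01-15T00:00:00Z'])

# The trailing 'Z' marks UTC, so parse as UTC and use a UTC reference date.
born_ts = pd.to_datetime(born, utc=True)
ref = pd.Timestamp('2017-10-01', tz='UTC')  # fixed date, assumed for this example

# Elapsed days divided by the average year length, then rounded down.
age = (ref - born_ts).dt.days / 365.25
whole_years = (age // 1).astype(int)
print(whole_years.tolist())  # [61, 27]
```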
Answered by Keiku
If the data contains a year outside the range that a nanosecond timestamp can represent (e.g. 1601), as below, pd.to_datetime raises an OutOfBoundsDatetime error.
import pandas as pd
(pd.to_datetime('today').year-pd.to_datetime('1601-07-01').year)
# Traceback (most recent call last):
# File "/home/kuroyanagi/.pyenv/versions/anaconda3-4.4.0/lib/python3.6/site-packages/pandas/core/tools/datetimes.py", line 444, in _convert_listlike
# values, tz = tslib.datetime_to_datetime64(arg)
# File "pandas/_libs/tslib.pyx", line 1810, in pandas._libs.tslib.datetime_to_datetime64 (pandas/_libs/tslib.c:33275)
# TypeError: Unrecognized value type: <class 'str'>
# During handling of the above exception, another exception occurred:
# Traceback (most recent call last):
# File "/home/kuroyanagi/.pyenv/versions/anaconda3-4.4.0/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 2881, in run_code
# exec(code_obj, self.user_global_ns, self.user_ns)
# File "<ipython-input-45-829e219d9060>", line 1, in <module>
# (pd.to_datetime('today').year-pd.to_datetime('1601-07-01').year)
# File "/home/kuroyanagi/.pyenv/versions/anaconda3-4.4.0/lib/python3.6/site-packages/pandas/core/tools/datetimes.py", line 518, in to_datetime
# result = _convert_listlike(np.array([arg]), box, format)[0]
# File "/home/kuroyanagi/.pyenv/versions/anaconda3-4.4.0/lib/python3.6/site-packages/pandas/core/tools/datetimes.py", line 447, in _convert_listlike
# raise e
# File "/home/kuroyanagi/.pyenv/versions/anaconda3-4.4.0/lib/python3.6/site-packages/pandas/core/tools/datetimes.py", line 435, in _convert_listlike
# require_iso8601=require_iso8601
# File "pandas/_libs/tslib.pyx", line 2355, in pandas._libs.tslib.array_to_datetime (pandas/_libs/tslib.c:46617)
# File "pandas/_libs/tslib.pyx", line 2538, in pandas._libs.tslib.array_to_datetime (pandas/_libs/tslib.c:45511)
# File "pandas/_libs/tslib.pyx", line 2506, in pandas._libs.tslib.array_to_datetime (pandas/_libs/tslib.c:44978)
# File "pandas/_libs/tslib.pyx", line 2500, in pandas._libs.tslib.array_to_datetime (pandas/_libs/tslib.c:44859)
# File "pandas/_libs/tslib.pyx", line 1517, in pandas._libs.tslib.convert_to_tsobject (pandas/_libs/tslib.c:28598)
# File "pandas/_libs/tslib.pyx", line 1774, in pandas._libs.tslib._check_dts_bounds (pandas/_libs/tslib.c:32752)
# pandas._libs.tslib.OutOfBoundsDatetime: Out of bounds nanosecond timestamp: 1601-07-01 00:00:00
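The error arises because a nanosecond-resolution timestamp only spans roughly the years 1677 to 2262. One possible workaround, sketched here and not taken from the answer, is to represent such dates as a `pd.Period` with daily frequency, which covers a much wider range. The fixed reference date is an assumption made for reproducibility.

```python
import pandas as pd

# A daily Period can represent years that a nanosecond Timestamp cannot.
born = pd.Period(year=1601, month=7, day=1, freq='D')
ref = pd.Period(year=2017, month=10, day=1, freq='D')  # assumed reference date

# Year difference, analogous to the failing expression above.
print(ref.year - born.year)  # 416
```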
For data that includes years outside the range pd.to_datetime supports, you can calculate the elapsed years as follows.
import numpy as np
import pandas as pd
date = pd.Series(['1601-07-01', '1956-07-01'])
def elapsed_years(date):
    # Work on the year and month substrings directly, so that years
    # pd.to_datetime cannot parse (such as 1601) are still handled.
    today = pd.to_datetime('today')
    # The builtin float replaces np.float, which newer NumPy versions removed.
    year = date.str.slice(0, 4).astype(float)
    month = date.str.slice(5, 7).astype(float)
    # Whole years elapsed, counted in months and rounded down.
    duration = np.floor((12 * (today.year - year) + (today.month - month)) / 12)
    return duration
elapsed_years(date)
# Out[46]:
# 0 416.0
# 1 61.0
# dtype: float64
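The calculation above uses only the year and month. If day-level precision is needed, a sketch using the standard library's `datetime.date`, which accepts years from 1 to 9999, is one alternative. The helper name `age_from_iso` is hypothetical, and the reference date is fixed so the output is reproducible.

```python
from datetime import date

def age_from_iso(text, ref):
    """Completed years from an ISO-style date string to ref (hypothetical helper)."""
    # Only the leading YYYY-MM-DD part is used; any time or 'Z' suffix is ignored.
    born = date(int(text[0:4]), int(text[5:7]), int(text[8:10]))
    # Subtract one if the birthday has not yet occurred in the reference year.
    before_birthday = (ref.month, ref.day) < (born.month, born.day)
    return ref.year - born.year - int(before_birthday)

ref = date(2017, 10, 1)  # assumed reference date
ages = [age_from_iso(s, ref) for s in ['1601-07-01T00:00:00Z', '1956-07-01T00:00:00Z']]
print(ages)  # [416, 61]
```

For a pandas Series of strings, the helper could be applied element-wise, for example with `Series.apply`.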