Original question: http://stackoverflow.com/questions/32827169/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Python: reduce precision pandas timestamp dataframe
Asked by emax
Hello, I have the following dataframe:
df =
Record_ID Time
94704 2014-03-10 07:19:19.647342
94705 2014-03-10 07:21:44.479363
94706 2014-03-10 07:21:45.479581
94707 2014-03-10 07:21:54.481588
94708 2014-03-10 07:21:55.481804
Is it possible to have the following?
df1 =
Record_ID Time
94704 2014-03-10 07:19:19
94705 2014-03-10 07:21:44
94706 2014-03-10 07:21:45
94707 2014-03-10 07:21:54
94708 2014-03-10 07:21:55
Answered by unutbu
You could convert the underlying datetime64[ns] values to datetime64[s] values using astype:
In [11]: df['Time'] = df['Time'].astype('datetime64[s]')
In [12]: df
Out[12]:
Record_ID Time
0 94704 2014-03-10 07:19:19
1 94705 2014-03-10 07:21:44
2 94706 2014-03-10 07:21:45
3 94707 2014-03-10 07:21:54
4 94708 2014-03-10 07:21:55
Note that pandas versions before 2.0 store all datetime values in a Series or DataFrame as datetime64[ns], so there the datetime64[s] values are automatically converted back to datetime64[ns]. The end result is still stored as datetime64[ns] values, but the call to astype removes the fractional part of the seconds. From pandas 2.0 onward the column can keep the datetime64[s] dtype; the fractional seconds are removed either way.
If you wish to have a NumPy array of datetime64[s] values, you could use df['Time'].values.astype('datetime64[s]').
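As a minimal sketch of the two conversions described above, the following rebuilds a few rows of the question's data (the variable names truncated and as_numpy are only illustrative) and applies astype both to the Series and to the underlying NumPy array. The dtype the Series reports afterwards depends on the installed pandas version, so the code prints it rather than assuming a value.

```python
import pandas as pd

# Sample rows copied from the question; the frame is rebuilt here for illustration.
df = pd.DataFrame({
    "Record_ID": [94704, 94705, 94706],
    "Time": pd.to_datetime([
        "2014-03-10 07:19:19.647342",
        "2014-03-10 07:21:44.479363",
        "2014-03-10 07:21:45.479581",
    ]),
})

# Casting to second resolution drops the fractional part of the seconds.
truncated = df["Time"].astype("datetime64[s]")
print(truncated.dtype)  # the reported resolution depends on the pandas version
print(truncated)

# A plain NumPy array that holds second-resolution values.
as_numpy = df["Time"].values.astype("datetime64[s]")
print(as_numpy.dtype)
```

Assigning truncated back to df['Time'] reproduces the in-place update shown in the answer.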
Answered by Anand S Kumar
If you really must remove the microsecond part of the datetime, you can use the Timestamp.replace method together with the Series.apply method to apply it across the series and replace the microsecond part with 0. Example -
df['Time'] = df['Time'].apply(lambda x: x.replace(microsecond=0))
Demo -
In [25]: df
Out[25]:
Record_ID Time
0 94704 2014-03-10 07:19:19.647342
1 94705 2014-03-10 07:21:44.479363
2 94706 2014-03-10 07:21:45.479581
3 94707 2014-03-10 07:21:54.481588
4 94708 2014-03-10 07:21:55.481804
In [26]: type(df['Time'][0])
Out[26]: pandas.tslib.Timestamp
In [27]: df['Time'] = df['Time'].apply(lambda x: x.replace(microsecond=0))
In [28]: df
Out[28]:
Record_ID Time
0 94704 2014-03-10 07:19:19
1 94705 2014-03-10 07:21:44
2 94706 2014-03-10 07:21:45
3 94707 2014-03-10 07:21:54
4 94708 2014-03-10 07:21:55
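For readers who want to run the approach above outside an interactive session, here is a small self-contained sketch. The helper name drop_microseconds is hypothetical and only wraps the Timestamp.replace call from the answer; the sample values come from the question.

```python
import pandas as pd

# Two timestamps copied from the question.
times = pd.Series(pd.to_datetime([
    "2014-03-10 07:19:19.647342",
    "2014-03-10 07:21:44.479363",
]))

def drop_microseconds(ts):
    """Return the same Timestamp with its microsecond field set to 0."""
    return ts.replace(microsecond=0)

# Series.apply calls the helper once per element.
cleaned = times.apply(drop_microseconds)
print(cleaned)
```

Because apply invokes a Python function for every row, this is likely to be slower than a vectorized call on large frames, so it is best suited to small data or one-off cleanups.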
Answered by eric R
For pandas version 0.24.0 or later, you can set the freq parameter of the dt.floor() function to get the precision you want (dt.ceil() takes the same parameter but rounds up to the next second, so it would turn 07:19:19.647342 into 07:19:20):
df['Time'] = df.Time.dt.floor(freq='s')
In [28]: df
Out[28]:
Record_ID Time
0 94704 2014-03-10 07:19:19
1 94705 2014-03-10 07:21:44
2 94706 2014-03-10 07:21:45
3 94707 2014-03-10 07:21:54
4 94708 2014-03-10 07:21:55
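The rounding direction matters when reducing precision this way. The sketch below, using the first timestamp from the question, compares dt.floor, dt.ceil and dt.round at one-second frequency. Only floor keeps the original second; ceil moves up to the next one, and round goes to the nearest.

```python
import pandas as pd

# First timestamp from the question.
times = pd.Series(pd.to_datetime(["2014-03-10 07:19:19.647342"]))

floored = times.dt.floor("s")  # truncate toward the earlier second
ceiled = times.dt.ceil("s")    # move up to the next whole second
rounded = times.dt.round("s")  # go to the nearest whole second

print("floor:", floored.iloc[0])  # 2014-03-10 07:19:19
print("ceil: ", ceiled.iloc[0])   # 2014-03-10 07:19:20
print("round:", rounded.iloc[0])  # 2014-03-10 07:19:20
```

If the goal is simply to drop the fractional seconds, as in the question, floor is the variant that matches the desired output.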

