pandas Python: reduce the precision of a pandas timestamp dataframe

Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original address and author information, and attribute it to the original authors (not me). Original Stack Overflow address: http://stackoverflow.com/questions/32827169/

Date: 2020-09-13 23:56:30  Source: igfitidea

Python: reduce precision pandas timestamp dataframe

python pandas timestamp dataframe

Asked by emax

Hello, I have the following dataframe:

df = 

       Record_ID       Time
        94704   2014-03-10 07:19:19.647342
        94705   2014-03-10 07:21:44.479363
        94706   2014-03-10 07:21:45.479581
        94707   2014-03-10 07:21:54.481588
        94708   2014-03-10 07:21:55.481804

Is it possible to have the following?

df1 = 

       Record_ID       Time
        94704   2014-03-10 07:19:19
        94705   2014-03-10 07:21:44
        94706   2014-03-10 07:21:45
        94707   2014-03-10 07:21:54
        94708   2014-03-10 07:21:55

Answered by unutbu

You could convert the underlying datetime64[ns] values to datetime64[s] values using astype:

In [11]: df['Time'] = df['Time'].astype('datetime64[s]')

In [12]: df
Out[12]: 
   Record_ID                Time
0      94704 2014-03-10 07:19:19
1      94705 2014-03-10 07:21:44
2      94706 2014-03-10 07:21:45
3      94707 2014-03-10 07:21:54
4      94708 2014-03-10 07:21:55

Note that in pandas versions before 2.0, Series and DataFrames store all datetime values as datetime64[ns], so these datetime64[s] values are automatically converted back to datetime64[ns]. The end result is still stored as datetime64[ns] values, but the call to astype removes the fractional part of the seconds. From pandas 2.0 onward, second resolution is supported directly, so the column keeps the datetime64[s] dtype.
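A minimal sketch of the astype approach on a small frame built from the question's sample rows. The names `df` and `truncated` are only illustrative, and the resulting dtype is printed rather than assumed because it depends on the installed pandas version:

```python
import pandas as pd

# Two sample rows taken from the question's dataframe.
df = pd.DataFrame({
    "Record_ID": [94704, 94705],
    "Time": pd.to_datetime([
        "2014-03-10 07:19:19.647342",
        "2014-03-10 07:21:44.479363",
    ]),
})

# Casting to second resolution drops the fractional seconds.
truncated = df["Time"].astype("datetime64[s]")

# The stored resolution differs between pandas versions, so inspect it.
print(truncated.dtype)
print(truncated.tolist())
```

Whichever dtype is reported, the values themselves no longer carry a sub-second part.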

If you wish to have a NumPy array of datetime64[s] values, you could use df['Time'].values.astype('datetime64[s]').
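As a rough illustration of the NumPy route, the sketch below casts the underlying array instead of the Series. The variable names `times` and `arr` are assumptions made for this example:

```python
import numpy as np
import pandas as pd

times = pd.Series(pd.to_datetime([
    "2014-03-10 07:19:19.647342",
    "2014-03-10 07:21:44.479363",
]))

# .values exposes the underlying NumPy array; casting it keeps the
# second resolution because NumPy does not convert it back.
arr = times.values.astype("datetime64[s]")

print(arr.dtype)  # datetime64[s]
print(arr)
```

The result is a plain NumPy array, so it no longer has the Series index or the `.dt` accessor.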

Answered by Anand S Kumar

If you really must remove the microsecond part of the datetime, you can use the Timestamp.replace method together with the Series.apply method to apply it across the series, replacing the microsecond part with 0. Example -

df['Time'] = df['Time'].apply(lambda x: x.replace(microsecond=0))

Demo -

In [25]: df
Out[25]:
   Record_ID                       Time
0      94704 2014-03-10 07:19:19.647342
1      94705 2014-03-10 07:21:44.479363
2      94706 2014-03-10 07:21:45.479581
3      94707 2014-03-10 07:21:54.481588
4      94708 2014-03-10 07:21:55.481804

In [26]: type(df['Time'][0])
Out[26]: pandas.tslib.Timestamp

In [27]: df['Time'] = df['Time'].apply(lambda x: x.replace(microsecond=0))

In [28]: df
Out[28]:
   Record_ID                Time
0      94704 2014-03-10 07:19:19
1      94705 2014-03-10 07:21:44
2      94706 2014-03-10 07:21:45
3      94707 2014-03-10 07:21:54
4      94708 2014-03-10 07:21:55
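One detail worth checking with this approach: a pandas Timestamp can also carry nanoseconds, which `microsecond=0` alone leaves untouched. The sketch below uses a made-up timestamp with a nanosecond component (not present in the question's data) to show the difference, and passes `nanosecond=0` as well:

```python
import pandas as pd

# Hypothetical value with nanoseconds, chosen only to illustrate the point.
ts = pd.Timestamp("2014-03-10 07:19:19.647342123")

only_us = ts.replace(microsecond=0)             # nanoseconds are kept
both = ts.replace(microsecond=0, nanosecond=0)  # whole seconds only

print(only_us)
print(both)

# The same replacement applied across a Series, one element at a time.
times = pd.Series(pd.to_datetime([
    "2014-03-10 07:19:19.647342",
    "2014-03-10 07:21:44.479363",
]))
stripped = times.apply(lambda x: x.replace(microsecond=0, nanosecond=0))
print(stripped)
```

Because apply calls the lambda once per element, this is likely to be slower on large frames than a vectorized conversion.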

Answered by eric R

For pandas version 0.24.0 or later, you can simply set the freq parameter of the floor() function to get the precision you want. Use floor() rather than ceil() here, since ceil() rounds up to the next whole second (07:19:19.647342 would become 07:19:20):

df['Time'] = df.Time.dt.floor(freq='s')

In [28]: df
Out[28]:
   Record_ID                Time
0      94704 2014-03-10 07:19:19
1      94705 2014-03-10 07:21:44
2      94706 2014-03-10 07:21:45
3      94707 2014-03-10 07:21:54
4      94708 2014-03-10 07:21:55
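The `.dt` accessor offers floor, ceil, and round, and they differ in direction. A small comparison on two of the question's values, with illustrative variable names, may help in picking the right one:

```python
import pandas as pd

times = pd.Series(pd.to_datetime([
    "2014-03-10 07:19:19.647342",
    "2014-03-10 07:21:44.479363",
]))

floored = times.dt.floor("s")  # always down: 07:19:19, 07:21:44
ceiled = times.dt.ceil("s")    # always up:   07:19:20, 07:21:45
rounded = times.dt.round("s")  # nearest:     07:19:20, 07:21:44

print(pd.DataFrame({"floor": floored, "ceil": ceiled, "round": rounded}))
```

Only floor reproduces the truncated values requested in the question.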