Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/24479577/

Date: 2020-09-13 22:12:26  Source: igfitidea

Pandas: Timestamp index rounding to the nearest 5th minute

python, pandas

Asked by Plug4

I have a df with the usual timestamps as an index:

    2011-04-01 09:30:00
    2011-04-01 09:30:10
    ...
    2011-04-01 09:36:20
    ...
    2011-04-01 09:37:30

How can I add a column to this dataframe with the same timestamps, but rounded to the nearest 5-minute interval? Like this:

    index                 new_col
    2011-04-01 09:30:00   2011-04-01 09:35:00        
    2011-04-01 09:30:10   2011-04-01 09:35:00
    2011-04-01 09:36:20   2011-04-01 09:40:00
    2011-04-01 09:37:30   2011-04-01 09:40:00

Answered by cronos

The round_to_5min(t) solution using timedelta arithmetic is correct but complicated and very slow. Instead, make use of the nice Timestamp in pandas:

import numpy as np
import pandas as pd

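ns5min = 5 * 60 * 1000000000   # 5 minutes in nanoseconds
# Integer-divide the epoch nanoseconds into 5-minute buckets, then step to the
# next 5-minute mark (the "+ 1" also moves exact marks forward, as in the question).
pd.to_datetime((df.index.astype(np.int64) // ns5min + 1) * ns5min)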
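As a minimal sketch of the integer arithmetic above, the snippet below applies it to the four sample timestamps from the question. The `value` column and the explicit cast to nanosecond resolution are assumptions added for illustration; the cast only guards against an index stored in a coarser unit, in which case the nanosecond constant would not apply.

```python
import numpy as np
import pandas as pd

# Sample index from the question; cast to nanoseconds so the integer
# representation is guaranteed to be nanoseconds since the epoch.
idx = pd.DatetimeIndex([
    "2011-04-01 09:30:00",
    "2011-04-01 09:30:10",
    "2011-04-01 09:36:20",
    "2011-04-01 09:37:30",
]).astype("datetime64[ns]")
df = pd.DataFrame({"value": range(len(idx))}, index=idx)  # "value" is a placeholder column

ns5min = 5 * 60 * 1_000_000_000  # 5 minutes in nanoseconds

# Bucket number of each timestamp, plus one, converted back to nanoseconds.
# A timestamp already on a 5-minute mark is also pushed to the next mark.
df["new_col"] = pd.to_datetime((df.index.astype(np.int64) // ns5min + 1) * ns5min)
print(df)
```

With this input, 09:30:00 and 09:30:10 both map to 09:35:00, and the last two rows map to 09:40:00, which matches the table in the question.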

Let's compare the speed:

rng = pd.date_range('1/1/2014', '1/2/2014', freq='S')

print(len(rng))
# 86401

# ipython %timeit 
%timeit pd.to_datetime(((rng.astype(np.int64) // ns5min + 1 ) * ns5min))
# 1000 loops, best of 3: 1.01 ms per loop

%timeit rng.map(round_to_5min)
# 1 loops, best of 3: 1.03 s per loop

Just about 1000 times faster!

Answered by dustyrockpyle

You can try something like this:

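import datetime

def round_to_5min(t):
    # Time elapsed since the previous 5-minute mark.
    delta = datetime.timedelta(minutes=t.minute%5, 
                               seconds=t.second, 
                               microseconds=t.microsecond)
    t -= delta  # snap down to the previous 5-minute mark
    # Move up to the next mark unless t was already exactly on one.
    if delta > datetime.timedelta(0):
        t += datetime.timedelta(minutes=5)
    return t

df['new_col'] = df.index.map(round_to_5min)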

Answered by Guido

One could simply use the round function of pandas:

df["timestamp_column"].dt.round("5min")

Check here for more details.
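For comparison, here is a short sketch of the built-in rounding methods applied directly to a DatetimeIndex, using the sample timestamps from the question. The column names (`nearest`, `down`, `up`, `next_mark`) are made up for illustration. Note that round goes to the nearest mark, so 09:36:20 becomes 09:35:00, whereas the table in the question always moves forward; the last line is one way to reproduce that table, assuming that is the behaviour wanted.

```python
import pandas as pd

idx = pd.DatetimeIndex([
    "2011-04-01 09:30:00",
    "2011-04-01 09:30:10",
    "2011-04-01 09:36:20",
    "2011-04-01 09:37:30",
])
df = pd.DataFrame(index=idx)

df["nearest"] = df.index.round("5min")  # nearest 5-minute mark
df["down"] = df.index.floor("5min")     # previous mark (exact marks unchanged)
df["up"] = df.index.ceil("5min")        # next mark (exact marks unchanged)

# Always step to the following mark, even from an exact one,
# which is what the expected output in the question shows.
df["next_mark"] = df.index.floor("5min") + pd.Timedelta(minutes=5)
print(df)
```

The same methods are available on a datetime column through the .dt accessor, as in the one-liner above.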

Answered by ShaharA

I had the same problem but with datetime64[ns] timestamps.

I used:

import datetime

def round_to_5min(t):
    """Round a timestamp down to the previous 5-minute mark (seconds are dropped)."""
    t = datetime.datetime(t.year, t.month, t.day, t.hour, t.minute - t.minute%5, 0)
    return t

followed by the 'map' function.
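A minimal sketch of that last step, assuming the timestamps are the DataFrame's index as in the question (the sample data and the `new_col` name are taken from the question, not from this answer). The function is repeated so the snippet runs on its own.

```python
import datetime

import pandas as pd

def round_to_5min(t):
    """Round a timestamp down to the previous 5-minute mark."""
    return datetime.datetime(t.year, t.month, t.day, t.hour, t.minute - t.minute % 5, 0)

idx = pd.DatetimeIndex([
    "2011-04-01 09:30:00",
    "2011-04-01 09:30:10",
    "2011-04-01 09:36:20",
    "2011-04-01 09:37:30",
])
df = pd.DataFrame(index=idx)

# Index.map calls the function once per timestamp.
df["new_col"] = df.index.map(round_to_5min)
print(df)
```

Because this version always rounds down, 09:36:20 and 09:37:30 both become 09:35:00.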