Original question: http://stackoverflow.com/questions/26342713/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Changing time frequency in Pandas Dataframe
Asked by Zhubarb
I have a Pandas DataFrame as below.
df
                                  A         B
date_time
2014-07-01 06:03:59.614000  62.1250       NaN
2014-07-01 06:03:59.692000  62.2500       NaN
2014-07-01 06:13:34.524000  62.2500  241.0625
2014-07-01 06:13:34.602000  62.2500  241.5000
2014-07-01 06:15:05.399000  62.2500  241.3750
2014-07-01 06:15:05.399000  62.2500  241.2500
2014-07-01 06:15:42.004000  62.2375  241.2500
2014-07-01 06:15:42.082000  62.2375  241.3750
2014-07-01 06:15:42.082000  62.2375  240.2500
I want to change the frequency of this to regular 1-minute intervals, but I get the error below:
new = df.asfreq('1Min')
>>error: cannot reindex from a duplicate axis
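The error can be traced to repeated timestamps in the index. The following is a minimal sketch, assuming a recent pandas version and using only four rows from the data above (the variable names dup_mask, n_duplicates and asfreq_failed are illustrative, not from the original post). It counts the repeated labels and shows that asfreq refuses to reindex them; the exact wording of the error message differs between pandas versions.

```python
import pandas as pd

# Four rows from the question; two readings share the timestamp 06:15:05.399
idx = pd.to_datetime([
    "2014-07-01 06:13:34.602",
    "2014-07-01 06:15:05.399",
    "2014-07-01 06:15:05.399",
    "2014-07-01 06:15:42.004",
])
df = pd.DataFrame(
    {"A": [62.25, 62.25, 62.25, 62.2375],
     "B": [241.5, 241.375, 241.25, 241.25]},
    index=idx,
)
df.index.name = "date_time"

# True for every label that repeats an earlier one
dup_mask = df.index.duplicated()
n_duplicates = int(dup_mask.sum())
print("repeated timestamps:", n_duplicates)

# asfreq reindexes onto a regular grid, which is not allowed
# while the existing index contains repeated labels
try:
    df.asfreq("1min")
    asfreq_failed = False
except ValueError as exc:
    asfreq_failed = True
    print("asfreq failed:", exc)
```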
Now, I understand why this is happening. Since my time granularity is high (in milliseconds) but irregular, I get multiple readings per minute, even per second. So I tried to combine these millisecond readings to minutes and get rid of duplicates as below.
# try to convert the index to minutes and drop duplicates
df['index'] = df.index
df['minute_index']= df['index'].apply( lambda x: x.strftime('%Y-%m-%d %H:%M'))
df.drop_duplicates(subset='minute_index', keep='last', inplace=True)
df_by_minute = df.set_index('minute_index')
df_by_minute
                        A       B                       index
minute_index
2014-07-01 06:03  62.2500     NaN  2014-07-01 06:03:59.692000
2014-07-01 06:13  62.2500  241.50  2014-07-01 06:13:34.602000
2014-07-01 06:15  62.2375  240.25  2014-07-01 06:15:42.082000
# now change the frequency to 1 minute but I just get NaNs (!)
df_by_minute.asfreq('1Min')
                       A    B  index
2014-07-01 06:03:00  NaN  NaN    NaT
2014-07-01 06:04:00  NaN  NaN    NaT
2014-07-01 06:05:00  NaN  NaN    NaT
2014-07-01 06:06:00  NaN  NaN    NaT
2014-07-01 06:07:00  NaN  NaN    NaT
2014-07-01 06:08:00  NaN  NaN    NaT
2014-07-01 06:09:00  NaN  NaN    NaT
2014-07-01 06:10:00  NaN  NaN    NaT
2014-07-01 06:11:00  NaN  NaN    NaT
2014-07-01 06:12:00  NaN  NaN    NaT
2014-07-01 06:13:00  NaN  NaN    NaT
2014-07-01 06:14:00  NaN  NaN    NaT
2014-07-01 06:15:00  NaN  NaN    NaT
As you can see, it does not work. Can someone help? What I am trying to achieve is a function that returns A or B as of a given DateTime, where DateTime is in 1Min increments.
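A likely reason for the all-NaN result is that strftime turns the index into plain strings, so the labels no longer match the timestamps that asfreq generates. The sketch below is one possible variant of the attempt, assuming a recent pandas version; it truncates the timestamps with floor so the index stays a DatetimeIndex. The names minute_idx, last_per_minute and regular are illustrative, not from the original post.

```python
import pandas as pd

# The nine readings from the question
idx = pd.to_datetime([
    "2014-07-01 06:03:59.614", "2014-07-01 06:03:59.692",
    "2014-07-01 06:13:34.524", "2014-07-01 06:13:34.602",
    "2014-07-01 06:15:05.399", "2014-07-01 06:15:05.399",
    "2014-07-01 06:15:42.004", "2014-07-01 06:15:42.082",
    "2014-07-01 06:15:42.082",
])
nan = float("nan")
df = pd.DataFrame(
    {"A": [62.125, 62.25, 62.25, 62.25, 62.25, 62.25, 62.2375, 62.2375, 62.2375],
     "B": [nan, nan, 241.0625, 241.5, 241.375, 241.25, 241.25, 241.375, 240.25]},
    index=idx,
)

# Truncate every timestamp to the start of its minute (still datetimes, not strings)
minute_idx = df.index.floor("min")

# Keep only the last reading within each minute
last_per_minute = df[~minute_idx.duplicated(keep="last")].copy()
last_per_minute.index = last_per_minute.index.floor("min")

# The index is now unique, so asfreq can build the regular 1-minute grid;
# minutes without any reading are filled with NaN
regular = last_per_minute.asfreq("1min")
print(regular)
```

The minutes that had readings (06:03, 06:13 and 06:15) keep their last values, and the empty minutes in between stay NaN unless they are filled afterwards.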
Accepted answer by Jihun
I think resample, not asfreq, fits your needs:
new = df.resample('1min').mean()
In place of .mean() you can also use .last() or .first(). On older pandas versions the same call was written df.resample('T', how='mean'), where the how option also accepted 'last' or 'first'.
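As a rough illustration of the resample approach on the data from the question, here is a minimal sketch assuming a recent pandas version, where the aggregation is chained as a method. The forward fill at the end is an added assumption about what "as of" should mean (carry the most recent reading forward) and is not part of the answer; the names mean_per_min, last_per_min and as_of are illustrative.

```python
import pandas as pd

# The nine readings from the question
idx = pd.to_datetime([
    "2014-07-01 06:03:59.614", "2014-07-01 06:03:59.692",
    "2014-07-01 06:13:34.524", "2014-07-01 06:13:34.602",
    "2014-07-01 06:15:05.399", "2014-07-01 06:15:05.399",
    "2014-07-01 06:15:42.004", "2014-07-01 06:15:42.082",
    "2014-07-01 06:15:42.082",
])
nan = float("nan")
df = pd.DataFrame(
    {"A": [62.125, 62.25, 62.25, 62.25, 62.25, 62.25, 62.2375, 62.2375, 62.2375],
     "B": [nan, nan, 241.0625, 241.5, 241.375, 241.25, 241.25, 241.375, 240.25]},
    index=idx,
)

# Average of all readings that fall inside each 1-minute bin
mean_per_min = df.resample("1min").mean()

# Last non-missing reading inside each 1-minute bin
last_per_min = df.resample("1min").last()

# Assumption: "as of" means the most recent known value, so empty
# minutes are filled by carrying the previous reading forward
as_of = last_per_min.ffill()
print(as_of)
```

Unlike asfreq, resample aggregates the readings within each bin, so repeated timestamps in the index are not a problem.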

