Pandas rolling window - datetime64[ns] not implemented

Question

Asked by David Crook

I'm attempting to use Python/Pandas to build some charts. I have data that is sampled every second. Here is a sample:

Index, Time, Value

31362, 1975-05-07 07:59:18,  36.151612
31363, 1975-05-07 07:59:19,  36.181368
31364, 1975-05-07 07:59:20,  36.197195
31365, 1975-05-07 07:59:21,  36.151413
31366, 1975-05-07 07:59:22,  36.138009
31367, 1975-05-07 07:59:23,  36.142962
31368, 1975-05-07 07:59:24,  36.122680

I need to create a variety of windows to look at the data: 10, 100, 1000, etc. Unfortunately, when I attempt to window the entire data frame, I get the error below:

NotImplementedError: ops for Rolling for this dtype datetime64[ns] are not implemented

I checked out these docs as a reference: http://pandas.pydata.org/pandas-docs/stable/computation.html. They appear to be doing this on date ranges. I did notice that the data type in their examples is different from the one I have.

Is there an easy way to do this?

This is ideally what I'm trying to do:

tmp = data.rolling(window=2)
tmp.mean()

I'm using plotly to plot the raw data and then the windowed data on top of it. My goal is to find the window sizes that best reveal the trends in the data by removing some of the noise.

Thanks!

Additional Notes:

I think I need to convert my data from this type:

pandas.core.series.Series

to this one:

pandas.tseries.index.DatetimeIndex

Answer 1

Answered by piRSquared

Setup

from io import StringIO  # Python 3; the Python 2 "StringIO" module no longer exists
import pandas as pd

text = """Index,Time,Value
31362,1975-05-07 07:59:18,36.151612
31363,1975-05-07 07:59:19,36.181368
31364,1975-05-07 07:59:20,36.197195
31365,1975-05-07 07:59:21,36.151413
31366,1975-05-07 07:59:22,36.138009
31367,1975-05-07 07:59:23,36.142962
31368,1975-05-07 07:59:24,36.122680"""

df = pd.read_csv(StringIO(text), index_col=0, parse_dates=[1])

df.rolling(2).mean()

NotImplementedError: ops for Rolling for this dtype datetime64[ns] are not implemented

First off, this is confirmation of @BrenBarn's comment and he should get the credit if he decides to post an answer. BrenBarn, if you decide to answer, I'll delete this post.

Explanation

Pandas has no idea what a rolling mean of date values ought to be. df.rolling(2).mean() is attempting to roll and average over both the Time and Value columns. The error is politely (or impolitely, depending on your perspective) telling you that you're trying something nonsensical.
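
To make the cause concrete, here is a minimal sketch that inspects the column dtypes and rolls over only the numeric column. It assumes Python 3 and a reasonably recent pandas; the name value_mean is just for illustration, and the exact error raised by rolling over the whole frame may differ between pandas versions.

```python
from io import StringIO
import pandas as pd

text = """Index,Time,Value
31362,1975-05-07 07:59:18,36.151612
31363,1975-05-07 07:59:19,36.181368
31364,1975-05-07 07:59:20,36.197195"""

df = pd.read_csv(StringIO(text), index_col=0, parse_dates=["Time"])

# Time is parsed as a datetime column, Value as float64.
# The datetime column is what the rolling mean cannot handle.
print(df.dtypes)

# Selecting only the numeric column sidesteps the datetime column entirely.
value_mean = df["Value"].rolling(2).mean()
print(value_mean)
```

This keeps the original integer index. Whether that or a time index is preferable depends on how the result will be plotted.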

Solution

Move the Time column to the index and then... well, that's it.

df.set_index('Time').rolling(2).mean()
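
Building on that one-liner, the sketch below shows one way the asker's goal of comparing several window sizes might look once Time is the index. The names indexed, smoothed, and by_time are hypothetical, and the window sizes 2 and 3 stand in for the 10, 100, 1000 mentioned in the question because the sample has only seven rows. The last call assumes that a DatetimeIndex allows a time-based window such as "3s", which is an addition to the original answer.

```python
from io import StringIO
import pandas as pd

text = """Index,Time,Value
31362,1975-05-07 07:59:18,36.151612
31363,1975-05-07 07:59:19,36.181368
31364,1975-05-07 07:59:20,36.197195
31365,1975-05-07 07:59:21,36.151413
31366,1975-05-07 07:59:22,36.138009
31367,1975-05-07 07:59:23,36.142962
31368,1975-05-07 07:59:24,36.122680"""

df = pd.read_csv(StringIO(text), index_col=0, parse_dates=["Time"])

# Moving Time into the index leaves only the numeric Value column to roll over.
indexed = df.set_index("Time")

# One smoothed column per window size, next to the raw values for plotting.
smoothed = pd.DataFrame({"raw": indexed["Value"]})
for window in (2, 3):
    smoothed[f"mean_{window}"] = indexed["Value"].rolling(window).mean()

# With a DatetimeIndex, the window can also be given as a time span.
by_time = indexed["Value"].rolling("3s").mean()

print(smoothed)
print(by_time)
```

A count-based window yields NaN until it has enough rows, while the time-based window averages whatever rows fall inside the span, so the first rows of by_time are already populated.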
