Python Pandas DataFrame - any way to set frequency programmatically?

Question

Asked by birone

I'm trying to process CSV files like this:

df = pd.read_csv("raw_hl.csv", index_col='time', parse_dates=True)
df.head(2)
                    high        low 
time                
2014-01-01 17:00:00 1.376235    1.375945
2014-01-01 17:01:00 1.376005    1.375775
2014-01-01 17:02:00 1.375795    1.375445
2014-01-01 17:07:00 NaN         NaN 
...
2014-01-01 17:49:00 1.375645    1.375445

type(df.index)
pandas.tseries.index.DatetimeIndex

But these don't automatically have a frequency:

print(df.index.freq)
None

In case they have differing frequencies, it would be handy to be able to set one automatically. The simplest way would be to compare the first two rows:

tdelta = df.index[1] - df.index[0]
tdelta
datetime.timedelta(0, 60)

So far so good, but setting frequency directly to this timedelta fails:

df.index.freq = tdelta
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-25-3f24abacf9de> in <module>()
----> 1 df.index.freq = tdelta

AttributeError: can't set attribute

Is there a way (ideally relatively painless!) to do this?

ANSWER: pandas gives the DataFrame's index an index.inferred_freq attribute, perhaps to avoid overwriting a user-defined frequency. For this data, df.index.inferred_freq is 'T'.

So it just seems to be a matter of using this instead of df.index.freq. Thanks to Jeff, who also provides more details below :)
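As a minimal sketch of how the inferred frequency might be put to use: inferred_freq only reports what pandas detects, so one option is to pass it to asfreq, which returns a new frame whose index carries that frequency. The sample values and the names idx, df_freq, and offset below are made up for illustration, and the sketch assumes a reasonably recent pandas.

```python
import pandas as pd
from pandas.tseries.frequencies import to_offset

# Build a one-minute index from plain strings so that no frequency is attached,
# similar to what read_csv produces (hypothetical sample data).
idx = pd.DatetimeIndex(
    ["2014-01-01 17:00", "2014-01-01 17:01", "2014-01-01 17:02", "2014-01-01 17:03"],
    name="time",
)
df = pd.DataFrame({"high": [1.3762, 1.3760, 1.3758, 1.3756]}, index=idx)

print(df.index.freq)  # nothing was set when the index was constructed

# inferred_freq is a string alias, or None when the spacing is irregular.
inferred = df.index.inferred_freq

# asfreq returns a new frame whose index carries the given frequency.
if inferred is not None:
    df_freq = df.asfreq(inferred)
    print(df_freq.index.freq)

# The timedelta from the question can also be converted into an offset.
tdelta = df.index[1] - df.index[0]
offset = to_offset(tdelta)
```

Note that asfreq inserts NaN rows for any timestamps missing from the original index, so it is best suited to data that is already regular.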

Answer 1

Answered by Jeff

If you have a regular frequency, it will be reported when you look at df.index.freq

In [20]: df = pd.DataFrame({'A' : np.arange(5)},index=pd.date_range('20130101 09:00:00',freq='3T',periods=5))

In [21]: df
Out[21]: 
                     A
2013-01-01 09:00:00  0
2013-01-01 09:03:00  1
2013-01-01 09:06:00  2
2013-01-01 09:09:00  3
2013-01-01 09:12:00  4

In [22]: df.index.freq
Out[22]: <3 * Minutes>

Having an irregular frequency will return None

In [23]: df.index = df.index[0:2].tolist() + [pd.Timestamp('20130101 09:05:00')] + df.index[-2:].tolist()

In [24]: df
Out[24]: 
                     A
2013-01-01 09:00:00  0
2013-01-01 09:03:00  1
2013-01-01 09:05:00  2
2013-01-01 09:09:00  3
2013-01-01 09:12:00  4

In [25]: df.index.freq
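
To see why no frequency is reported, it can help to look at the gaps between consecutive timestamps. The following is a rough sketch using the same irregular timestamps as above; the variable names idx and steps are chosen here for illustration.

```python
import pandas as pd

# Same irregular timestamps as in the session above.
idx = pd.DatetimeIndex(
    ["2013-01-01 09:00", "2013-01-01 09:03", "2013-01-01 09:05",
     "2013-01-01 09:09", "2013-01-01 09:12"]
)

print(idx.freq)           # no frequency is attached
print(idx.inferred_freq)  # and none can be inferred from uneven spacing

# Count each distinct gap between consecutive timestamps.
steps = idx.to_series().diff().dropna().value_counts()
print(steps)
```

More than one distinct gap in steps means the index has no single fixed frequency.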

You can recover a regular frequency as follows: resample to a finer frequency (one where no two values fall in the same bin), forward fill, then reindex to the desired frequency and end points.

In [31]: df.resample('T').ffill().reindex(pd.date_range(df.index[0],df.index[-1],freq='3T'))
Out[31]: 
                     A
2013-01-01 09:00:00  0
2013-01-01 09:03:00  1
2013-01-01 09:06:00  2
2013-01-01 09:09:00  3
2013-01-01 09:12:00  4
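
The same steps can be wrapped into a small self-contained script. This is only a sketch: regularize is a hypothetical helper name, and the default "1min" and "3min" frequencies are assumptions that fit this particular example rather than general-purpose choices.

```python
import numpy as np
import pandas as pd

# Irregular index from the example above (09:05 instead of 09:06).
irregular = pd.DatetimeIndex(
    ["2013-01-01 09:00", "2013-01-01 09:03", "2013-01-01 09:05",
     "2013-01-01 09:09", "2013-01-01 09:12"]
)
df = pd.DataFrame({"A": np.arange(5)}, index=irregular)


def regularize(frame, fine="1min", target="3min"):
    """Hypothetical helper: resample finely, forward fill, reindex to target."""
    # Regular grid spanning the original end points at the desired frequency.
    target_index = pd.date_range(frame.index[0], frame.index[-1], freq=target)
    # Each fine-grained slot takes the most recent observed value, and the
    # reindex then keeps only the slots on the target grid.
    return frame.resample(fine).ffill().reindex(target_index)


regular = regularize(df)
print(regular)
print(regular.index.freq)
```

Forward filling means the value at 09:06 is the one observed at 09:05, so the result is an approximation whenever the original timestamps do not fall on the target grid.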
