pandas.DatetimeIndex frequency is None and can't be set
Original question: http://stackoverflow.com/questions/46217529/
This page reproduces a popular Stack Overflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license, cite the original URL, and attribute it to the original authors on Stack Overflow (not to this page).
Asked by clstaudt
I created a DatetimeIndex from a "date" column:
sales.index = pd.DatetimeIndex(sales["date"])
Now the index looks as follows:
DatetimeIndex(['2003-01-02', '2003-01-03', '2003-01-04', '2003-01-06',
'2003-01-07', '2003-01-08', '2003-01-09', '2003-01-10',
'2003-01-11', '2003-01-13',
...
'2016-07-22', '2016-07-23', '2016-07-24', '2016-07-25',
'2016-07-26', '2016-07-27', '2016-07-28', '2016-07-29',
'2016-07-30', '2016-07-31'],
dtype='datetime64[ns]', name='date', length=4393, freq=None)
As you can see, the freq attribute is None. I suspect that errors down the road are caused by the missing freq. However, if I try to set the frequency explicitly:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-148-30857144de81> in <module>()
1 #### DEBUG
----> 2 sales_train = disentangle(df_train)
3 sales_holdout = disentangle(df_holdout)
4 result = sarima_fit_predict(sales_train.loc[5002, 9990]["amount_sold"], sales_holdout.loc[5002, 9990]["amount_sold"])
<ipython-input-147-08b4c4ecdea3> in disentangle(df_train)
2 # transform sales table to disentangle sales time series
3 sales = df_train[["date", "store_id", "article_id", "amount_sold"]]
----> 4 sales.index = pd.DatetimeIndex(sales["date"], freq="d")
5 sales = sales.pivot_table(index=["store_id", "article_id", "date"])
6 return sales
/usr/local/lib/python3.6/site-packages/pandas/util/_decorators.py in wrapper(*args, **kwargs)
89 else:
90 kwargs[new_arg_name] = new_arg_value
---> 91 return func(*args, **kwargs)
92 return wrapper
93 return _deprecate_kwarg
/usr/local/lib/python3.6/site-packages/pandas/core/indexes/datetimes.py in __new__(cls, data, freq, start, end, periods, copy, name, tz, verify_integrity, normalize, closed, ambiguous, dtype, **kwargs)
399 'dates does not conform to passed '
400 'frequency {1}'
--> 401 .format(inferred, freq.freqstr))
402
403 if freq_infer:
ValueError: Inferred frequency None from passed dates does not conform to passed frequency D
So apparently a frequency has been inferred, but it is stored in neither the freq nor the inferred_freq attribute of the DatetimeIndex: both are None. Can someone clear up the confusion?
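One way to investigate the situation described above is to compare the index against a gap-free daily range. The following is a minimal sketch, not the asker's actual data: the four dates are a made-up stand-in for sales["date"] in which 2003-01-05 is missing, mirroring the gaps visible in the printed index. With irregular spacing, pandas has nothing to infer, and that None is the "inferred frequency" named in the error message.

```python
import pandas as pd

# Hypothetical stand-in for sales["date"]; 2003-01-05 is deliberately missing.
dates = pd.to_datetime(["2003-01-02", "2003-01-03", "2003-01-04", "2003-01-06"])
idx = pd.DatetimeIndex(dates)

print(idx.freq)           # no frequency was declared
print(idx.inferred_freq)  # irregular spacing, so nothing can be inferred

# Dates that a gap-free daily index would contain but this one lacks.
full_range = pd.date_range(idx.min(), idx.max(), freq="D")
missing = full_range.difference(idx)
print(missing)

# Declaring a daily frequency is rejected because the dates do not conform.
try:
    pd.DatetimeIndex(dates, freq="D")
    conforms = True
except ValueError as err:
    conforms = False
    print(err)
```

If `missing` is non-empty, the data is not truly daily, and the frequency can only be declared after the gaps are filled or a frequency that really matches the dates is chosen.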
Accepted answer by Brad Solomon
You have a couple of options here:
pd.infer_freq
pd.tseries.frequencies.to_offset
"I suspect that errors down the road are caused by the missing freq."
You are absolutely right. Here's what I often use:
import pandas as pd

def add_freq(idx, freq=None):
    """Add a frequency attribute to idx, through inference or directly.

    Returns a copy.  If `freq` is None, it is inferred.
    """
    idx = idx.copy()
    if freq is None:
        if idx.freq is None:
            # Nothing declared yet: try to infer it from the dates.
            freq = pd.infer_freq(idx)
        else:
            # A frequency is already set; nothing to do.
            return idx
    idx.freq = pd.tseries.frequencies.to_offset(freq)
    if idx.freq is None:
        raise AttributeError('no discernible frequency found to `idx`.  Specify'
                             ' a frequency string with `freq`.')
    return idx
An example:
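idx = pd.to_datetime(['2003-01-02', '2003-01-03', '2003-01-06'])  # freq=None
print(add_freq(idx))  # inferred
DatetimeIndex(['2003-01-02', '2003-01-03', '2003-01-06'], dtype='datetime64[ns]', freq='B')
print(add_freq(idx, freq='D'))  # explicit
DatetimeIndex(['2003-01-02', '2003-01-03', '2003-01-06'], dtype='datetime64[ns]', freq='D')
Note that this output reflects the pandas version in use when the answer was written. More recent pandas releases validate an assigned freq against the dates, so the explicit 'D' call is expected to raise a ValueError for these three dates, which skip a weekend.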
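As a complement to the helper above, here is a minimal sketch of a non-mutating variant that builds a new index through the DatetimeIndex constructor, which checks the requested frequency against the dates (the same check that produced the ValueError in the question). The name with_freq is hypothetical and not part of pandas.

```python
import pandas as pd

def with_freq(idx, freq=None):
    """Return a new DatetimeIndex carrying `freq` (inferred when None).

    Hypothetical helper; raises ValueError if the dates do not conform.
    """
    if freq is None:
        freq = idx.freq
    if freq is None:
        freq = pd.infer_freq(idx)
    if freq is None:
        raise ValueError("no frequency could be inferred; pass `freq` explicitly")
    # The constructor validates that the dates match the frequency.
    return pd.DatetimeIndex(idx, freq=freq)

idx = pd.to_datetime(["2003-01-02", "2003-01-03", "2003-01-06"])  # Thu, Fri, Mon

business = with_freq(idx)  # business-day frequency is inferred
print(business.freqstr)

try:
    with_freq(idx, freq="D")  # calendar-daily does not fit: the weekend is absent
    daily_accepted = True
except ValueError:
    daily_accepted = False
print(daily_accepted)
```

Going through the constructor means a frequency that does not describe the data fails early instead of being attached silently.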
Using asfreq will actually reindex (fill) missing dates, so be careful if that's not what you're looking for.
"The primary function for changing frequencies is the asfreq function. For a DatetimeIndex, this is basically just a thin, but convenient wrapper around reindex which generates a date_range and calls reindex."
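The relationship between asfreq and reindex described above can be sketched as follows. This is an illustration on made-up data (the same three dates used elsewhere in this thread), not the pandas implementation itself: it builds the daily date_range by hand, reindexes onto it, and compares the result with asfreq.

```python
import pandas as pd

df = pd.DataFrame(
    {"x": [1, 2, 4]},
    index=pd.to_datetime(["2003-01-02", "2003-01-03", "2003-01-06"]),
)

# Build the gap-free daily range spanning the data, then reindex onto it.
full_range = pd.date_range(df.index.min(), df.index.max(), freq="D")
via_reindex = df.reindex(full_range)
via_asfreq = df.asfreq("D")

# Both routes produce the same dates, with NaN where no row existed.
same_dates = list(via_reindex.index) == list(via_asfreq.index)
same_values = via_reindex["x"].fillna(-1).tolist() == via_asfreq["x"].fillna(-1).tolist()
print(same_dates, same_values)
print(via_asfreq["x"].isna().sum())  # number of inserted, empty days
```

The inserted rows hold NaN, which is why the integer column becomes floating point after either call.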
Answer by JohnE
It seems to relate to missing dates, as 3kt notes. You might be able to "fix" it with asfreq('D') as EdChum suggests, but that gives you a continuous index with missing data values. It works fine for some sample data I made up:
df = pd.DataFrame({'x': [1, 2, 4]},
                  index=pd.to_datetime(['2003-01-02', '2003-01-03', '2003-01-06']))
df
Out[756]:
            x
2003-01-02  1
2003-01-03  2
2003-01-06  4
df.index
Out[757]: DatetimeIndex(['2003-01-02', '2003-01-03', '2003-01-06'],
              dtype='datetime64[ns]', freq=None)
Note that freq=None. If you apply asfreq('D'), this changes to freq='D':
df.asfreq('D')
Out[758]:
              x
2003-01-02  1.0
2003-01-03  2.0
2003-01-04  NaN
2003-01-05  NaN
2003-01-06  4.0
df.asfreq('d').index
Out[759]:
DatetimeIndex(['2003-01-02', '2003-01-03', '2003-01-04', '2003-01-05',
               '2003-01-06'],
              dtype='datetime64[ns]', freq='D')
More generally, and depending on what exactly you are trying to do, you might want to check out the following question for other options such as reindex and resample: "Add missing dates to pandas dataframe"
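To make the other options mentioned above concrete, here is a minimal sketch on the same made-up data. It assumes a missing day should count as zero (plausible for sales figures, but an assumption about your data) and shows two ways to get a daily index without NaN values.

```python
import pandas as pd

df = pd.DataFrame(
    {"x": [1, 2, 4]},
    index=pd.to_datetime(["2003-01-02", "2003-01-03", "2003-01-06"]),
)

# Option 1: insert the missing days with an explicit fill value instead of NaN.
zero_filled = df.asfreq("D", fill_value=0)

# Option 2: resample into daily bins; bins with no rows sum to 0.
resampled = df.resample("D").sum()

print(zero_filled)
print(resampled.index.freqstr)
```

Both results carry a daily frequency on the index. resample additionally aggregates when several rows fall on the same day, whereas asfreq expects at most one row per date.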
Answer by mrbTT
I'm not sure if earlier versions of Python had this, but 3.6 has this simple solution:
# 'b' stands for business days
# 'w' for weekly, 'd' for daily, and you get the idea...
df.index.freq = 'b'
Answer by Riz.Khan
I am not sure, but I was having the same error. I was not able to resolve my issue with the suggestions posted above, but I solved it using the solution from this question:
Pandas DatetimeIndex + seasonal_decompose = missing frequency.
Best regards