Original question: http://stackoverflow.com/questions/51656065/
This content comes from Stack Overflow and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors.
Pandas Resampling: TypeError: Only valid with DatetimeIndex, TimedeltaIndex or PeriodIndex, but got an instance of 'RangeIndex'
Asked by HeadOverFeet
Please help. I want to resample my data to a daily ('1D') frequency with pandas. The data has the format shown below.
I want to resample by date and product, and also fill the missing values.
But I keep getting the error below. I tried about five variants, and only the part after "instance of" changes: I have also seen 'MultiIndex' and 'Index' there.
TypeError: Only valid with DatetimeIndex, TimedeltaIndex or PeriodIndex, but got an instance of 'RangeIndex'
product value date
A 1.52 2016-01-01
A NULL 2016-09-20
A 1.33 2018-08-02
B 1.30 2016-01-01
B NULL 2017-01-02
B 1.54 2017-03-10
B 2.08 2017-06-28
B 2.33 2018-08-02
I ran the following on this data:
df.reset_index().set_index('date','sku')
df= df.groupby('product').resample('1D')['value'].ffill().bfill().ffill()
I also tried:
df = df.set_index(['date','sku'])
df = df.set_index('date','sku')
df = df.reset_index().set_index(['date','sku'])
Can you please explain what I am doing wrong? Thanks!
This morning it worked with this data and the command from jezrael:
df = df.set_index('date').groupby('product').resample('1D')['value'].ffill()
product value date
0 A 1.52 2016-01-01
1 A NaN 2016-09-20
2 A 1.87 2018-08-02
3 B 2.33 2016-01-01
4 B NaN 2016-09-20
5 B 4.55 2018-08-02
But suddenly it no longer works. Now the error message says 'Index'.
Answered by jezrael
You need a DatetimeIndex when working with DataFrameGroupBy.resample. Also, bfill is omitted, because if a group contains only NaNs, its values would be back-filled from another group:
#if necessary convert to datetimes
#df['date'] = pd.to_datetime(df['date'])
df = df.set_index('date').groupby('product').resample('1D')['value'].ffill()
print (df)
product date
A 2016-01-01 1.52
2016-01-02 1.52
2016-01-03 1.52
2016-01-04 1.52
2016-01-05 1.52
2016-01-06 1.52
2016-01-07 1.52
2016-01-08 1.52
2016-01-09 1.52
2016-01-10 1.52
2016-01-11 1.52
2016-01-12 1.52
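The 'Index' variant of the error usually means the date column still holds strings when it becomes the index. The following is a minimal sketch of that situation, using a shortened, made-up version of the question's table (the values and dates here are illustrative assumptions, not the asker's real data). It first triggers the TypeError with a string index, then converts the column with pd.to_datetime and resamples successfully.

```python
import pandas as pd

# Hypothetical, shortened sample; dates are plain strings on purpose
df = pd.DataFrame({
    "product": ["A", "A", "B", "B"],
    "value": [1.52, 1.33, 1.30, 1.54],
    "date": ["2016-01-01", "2016-01-04", "2016-01-01", "2016-01-03"],
})

# With string dates the index is a plain Index, so resample is rejected
error_message = None
try:
    df.set_index("date").groupby("product").resample("1D")["value"].ffill()
except TypeError as exc:
    error_message = str(exc)
print(error_message)

# Convert to datetimes first; set_index then produces a DatetimeIndex
df["date"] = pd.to_datetime(df["date"])
indexed = df.set_index("date")
print(type(indexed.index).__name__)

# Daily resampling per product, forward filling the inserted days
result = indexed.groupby("product").resample("1D")["value"].ffill()
print(result)
```

Checking type(df.index) right before the resample call is a quick way to confirm which of the three accepted index types, if any, is actually in place.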
Changed sample for a better explanation:
print (df)
product value date
0 A 1.52 2016-01-01
1 A NaN 2016-01-03
2 B NaN 2017-01-02
3 B NaN 2017-01-03
4 C 1.54 2017-03-10
5 C 2.08 2017-03-12
6 C 2.33 2017-03-14
df1 = df.set_index('date').groupby('product').resample('1D')['value'].ffill()
print (df1)
product date
A 2016-01-01 1.52
2016-01-02 1.52
2016-01-03 NaN <- NaN is kept because the original data has NaN on this date
B 2017-01-02 NaN <- group B contains only NaN
2017-01-03 NaN
C 2017-03-10 1.54
2017-03-11 1.54
2017-03-12 2.08
2017-03-13 2.08
2017-03-14 2.33
Name: value, dtype: float64
df11 = df.set_index('date').groupby('product').resample('1D')['value'].ffill().bfill()
print (df11)
product date
A 2016-01-01 1.52
2016-01-02 1.52
2016-01-03 1.54 <- back filling value from group C
B 2017-01-02 1.54 <- back filling value from group C
2017-01-03 1.54 <- back filling value from group C
C 2017-03-10 1.54
2017-03-11 1.54
2017-03-12 2.08
2017-03-13 2.08
2017-03-14 2.33
Name: value, dtype: float64
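If back filling is still wanted, one possible way to keep it from crossing group boundaries is to group the resampled result by its 'product' index level before calling bfill. This is a sketch that goes beyond the answer above, not part of it; it reuses the changed sample, and the name per_group is only illustrative.

```python
import numpy as np
import pandas as pd

# The changed sample from the answer above
df = pd.DataFrame({
    "product": ["A", "A", "B", "B", "C", "C", "C"],
    "value": [1.52, np.nan, np.nan, np.nan, 1.54, 2.08, 2.33],
    "date": pd.to_datetime([
        "2016-01-01", "2016-01-03",
        "2017-01-02", "2017-01-03",
        "2017-03-10", "2017-03-12", "2017-03-14",
    ]),
})

# Daily resampling per product with forward fill, as in the answer
df1 = df.set_index("date").groupby("product").resample("1D")["value"].ffill()

# Back fill inside each product only, so group C cannot leak into A or B
per_group = df1.groupby(level="product").bfill()
print(per_group)
```

With this variant, a group that contains only NaN (B in the sample) stays NaN instead of receiving values from the next group.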