Pandas 重采样:TypeError:仅对 DatetimeIndex、TimedeltaIndex 或 PeriodIndex 有效,但得到了“RangeIndex”的实例

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/51656065/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-09-14 05:52:42  来源:igfitidea点击:

Pandas Resampling: TypeError: Only valid with DatetimeIndex, TimedeltaIndex or PeriodIndex, but got an instance of 'RangeIndex'

pythonpandas

提问by HeadOverFeet

Please, help me. I want to resample based on 1D. I have following format of data. I want to use resampling in pandas.

请帮我。我想基于 1D 重新采样。我有以下数据格式。我想在Pandas中使用重采样。

I want to resample based on Date and product and also fill the missing values.

我想根据日期和产品重新采样并填充缺失值。

But I keep getting this mistake: I tried like 5 options and mistake only changes after "instance of": I saw there Multiindex, Index.

但是我一直在犯这个错误:我尝试了 5 个选项,但错误仅在“实例”之后发生了变化:我在那里看到了 Multiindex、Index。

TypeError: Only valid with DatetimeIndex, TimedeltaIndex or PeriodIndex, but got an instance of 'RangeIndex'

类型错误:仅对 DatetimeIndex、TimedeltaIndex 或 PeriodIndex 有效,但获得了“RangeIndex”的实例

product value   date
A   1.52    2016-01-01
A   NULL    2016-09-20
A   1.33    2018-08-02
B   1.30    2016-01-01
B   NULL    2017-01-02
B   1.54    2017-03-10
B   2.08    2017-06-28
B   2.33    2018-08-02

I put these data into

我把这些数据放入

df.reset_index().set_index('date','sku')  
df= df.groupby('product').resample('1D')['value'].ffill().bfill().ffill()

I tried also:

我也试过:

df = df.set_index(['date','sku'])
df = df.set_index('date','sku')
df = df.reset_index().set_index(['date','sku'])  

Please, can you explain me what I am doing wrong? Thanks!

拜托,你能解释一下我做错了什么吗?谢谢!

Today morning it was working on these data and the command from Jezrael:

今天早上它正在处理这些数据和来自 Jezrael 的命令:

df = df.set_index('date').groupby('product').resample('1D')['value'].ffill()

    product value   date
   0    A   1.52    2016-01-01
   1    A   NaN 2016-09-20 
   2    A   1.87    2018-08-02
   3    B   2.33    2016-01-01
   4    B   NaN 2016-09-20
   5    B   4.55    2018-08-02

But suddenly it doesnt anymore. Now I have Index in the error statement.

但是突然就没有了。现在我在错误语句中有索引。

回答by jezrael

You need DatetimeIndexif working with DataFrameGroupBy.resample, also bfillis omited because if some only NaNs groups is possible these data are replaced from another groups:

你需要DatetimeIndex,如果有工作DataFrameGroupBy.resample,也bfill正在被遗漏的,因为如果有的只有NaNS基团是可能的,这些数据是从另一组取代:

#if necessary convert to datetimes 
#df['date'] = pd.to_datetime(df['date'])

df = df.set_index('date').groupby('product').resample('1D')['value'].ffill()
print (df)
product  date      
A        2016-01-01    1.52
         2016-01-02    1.52
         2016-01-03    1.52
         2016-01-04    1.52
         2016-01-05    1.52
         2016-01-06    1.52
         2016-01-07    1.52
         2016-01-08    1.52
         2016-01-09    1.52
         2016-01-10    1.52
         2016-01-11    1.52
         2016-01-12    1.52

Changed samplefor better explanation:

更改示例以获得更好的解释:

print (df)
  product  value       date
0       A   1.52 2016-01-01
1       A    NaN 2016-01-03
2       B    NaN 2017-01-02
3       B    NaN 2017-01-03
4       C   1.54 2017-03-10
5       C   2.08 2017-03-12
6       C   2.33 2017-03-14


df1 = df.set_index('date').groupby('product').resample('1D')['value'].ffill()
print (df1)
product  date      
A        2016-01-01    1.52
         2016-01-02    1.52
         2016-01-03     NaN < NaN is not changed because in original data
B        2017-01-02     NaN <- only NaN group B
         2017-01-03     NaN
C        2017-03-10    1.54
         2017-03-11    1.54
         2017-03-12    2.08
         2017-03-13    2.08
         2017-03-14    2.33
Name: value, dtype: float64

df11 = df.set_index('date').groupby('product').resample('1D')['value'].ffill().bfill()
print (df11)
product  date      
A        2016-01-01    1.52
         2016-01-02    1.52
         2016-01-03    1.54 <- back filling value from group C
B        2017-01-02    1.54 <- back filling value from group C
         2017-01-03    1.54 <- back filling value from group C
C        2017-03-10    1.54
         2017-03-11    1.54
         2017-03-12    2.08
         2017-03-13    2.08
         2017-03-14    2.33
Name: value, dtype: float64