pandas resample with fill_method: need to know which row the data was copied from?

Question

Asked by pvncad

I am trying to use the resample method to fill the gaps in time series data, but I also want to know which row was used to fill in each missing value.

This is my input series.

In [28]: data
Out[28]: 
Date
2002-09-09    233.25
2002-09-11    233.05
2002-09-16    230.25
2002-09-18    230.10
2002-09-19    230.05
Name: Price

With resample, I get this:

In [29]: data.resample("D", fill_method='bfill')
Out[29]: 
Date
2002-09-09    233.25
2002-09-10    233.05
2002-09-11    233.05
2002-09-12    230.25
2002-09-13    230.25
2002-09-14    230.25
2002-09-15    230.25
2002-09-16    230.25
2002-09-17    230.10
2002-09-18    230.10
2002-09-19    230.05
Freq: D

I am looking for this:

Out[29]: 
Date
2002-09-09    233.25  2002-09-09
2002-09-10    233.05  2002-09-11
2002-09-11    233.05  2002-09-11
2002-09-12    230.25  2002-09-16
2002-09-13    230.25  2002-09-16
2002-09-14    230.25  2002-09-16
2002-09-15    230.25  2002-09-16
2002-09-16    230.25  2002-09-16
2002-09-17    230.10  2002-09-18
2002-09-18    230.10  2002-09-18
2002-09-19    230.05  2002-09-19

Any help?

Answer 1

Accepted answer by Garrett

After converting the Series to a DataFrame, copy the index into its own column. (DatetimeIndex.format() is useful here because it returns a string representation of the index rather than Timestamp/datetime objects.)

In [510]: df = pd.DataFrame(data)

In [511]: df['OrigDate'] = df.index.format()

In [513]: df
Out[513]: 
             Price    OrigDate
Date                          
2002-09-09  233.25  2002-09-09
2002-09-11  233.05  2002-09-11
2002-09-16  230.25  2002-09-16
2002-09-18  230.10  2002-09-18
2002-09-19  230.05  2002-09-19

For resampling without aggregation, there is a helper method asfreq().

In [528]: df.asfreq("D", method='bfill')
Out[528]: 
             Price    OrigDate
2002-09-09  233.25  2002-09-09
2002-09-10  233.05  2002-09-11
2002-09-11  233.05  2002-09-11
2002-09-12  230.25  2002-09-16
2002-09-13  230.25  2002-09-16
2002-09-14  230.25  2002-09-16
2002-09-15  230.25  2002-09-16
2002-09-16  230.25  2002-09-16
2002-09-17  230.10  2002-09-18
2002-09-18  230.10  2002-09-18
2002-09-19  230.05  2002-09-19
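
For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch of the same idea. The input series is rebuilt by hand from the question's data, and the names filled and OrigDate are just illustrative. I have used strftime() instead of DatetimeIndex.format() on the assumption that you are on a newer pandas, where format() is, as far as I know, deprecated.

```python
import pandas as pd

# Rebuild the question's input series by hand (values copied from the post).
data = pd.Series(
    [233.25, 233.05, 230.25, 230.10, 230.05],
    index=pd.DatetimeIndex(
        ["2002-09-09", "2002-09-11", "2002-09-16", "2002-09-18", "2002-09-19"],
        name="Date",
    ),
    name="Price",
)

# Convert to a DataFrame and copy the index into its own column as strings.
df = data.to_frame()
df["OrigDate"] = df.index.strftime("%Y-%m-%d")

# Upsample to daily frequency without aggregation; each new row is
# back-filled from the next existing row, including its OrigDate.
filled = df.asfreq("D", method="bfill")
print(filled)
```

Because OrigDate is filled together with Price, every row of the result carries the date of the row its value was copied from.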

This is effectively shorthand for the following, where last() is invoked on the intermediate DataFrameGroupBy objects.

In [529]: df.resample("D", how='last', fill_method='bfill')
Out[529]: 
             Price    OrigDate
Date                          
2002-09-09  233.25  2002-09-09
2002-09-10  233.05  2002-09-11
2002-09-11  233.05  2002-09-11
2002-09-12  230.25  2002-09-16
2002-09-13  230.25  2002-09-16
2002-09-14  230.25  2002-09-16
2002-09-15  230.25  2002-09-16
2002-09-16  230.25  2002-09-16
2002-09-17  230.10  2002-09-18
2002-09-18  230.10  2002-09-18
2002-09-19  230.05  2002-09-19
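
Note that the how= and fill_method= keywords belong to the older resample API and were removed in later pandas releases. A rough equivalent in the method-chaining style is sketched below. It assumes a reasonably recent pandas; the WasFilled column is a hypothetical extra I added for illustration, not something from the question. Here OrigDate is kept as real timestamps so it can be compared against the index.

```python
import pandas as pd

# Same data as in the question, built directly as a DataFrame.
df = pd.DataFrame(
    {"Price": [233.25, 233.05, 230.25, 230.10, 230.05]},
    index=pd.DatetimeIndex(
        ["2002-09-09", "2002-09-11", "2002-09-16", "2002-09-18", "2002-09-19"],
        name="Date",
    ),
)

# Keep the original dates as timestamps in a regular column.
df["OrigDate"] = df.index

# Daily bins, last value per bin, then back-fill the empty bins.
resampled = df.resample("D").last().bfill()

# Hypothetical helper column: True where the row was filled from a later row.
resampled["WasFilled"] = (
    resampled["OrigDate"].to_numpy() != resampled.index.to_numpy()
)
print(resampled)
```

With this layout, the rows where WasFilled is True are exactly the dates that were missing from the input.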
