Python: resampling minute data
Original question: http://stackoverflow.com/questions/14861023/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Resampling Minute data
Asked by aozkan
I have minute-based OHLCV data for the opening range/first hour (9:30-10:30 AM EST). I'm looking to resample this data so I can get one 60-minute value and then calculate the range.
When I call the dataframe.resample() function on the data, I get two rows, and the first row starts at 9:00 AM. I'm looking to get only one row, which starts at 9:30 AM.
Note: the initial data begins at 9:30.


Edit: Adding code:
# Extract data for regular trading hours (rth) from the 24 hour data set
rth = data.between_time(start_time = '09:30:00', end_time = '16:15:00', include_end = False)
# Extract data for extended trading hours (eth) from the 24 hour data set
eth = data.between_time(start_time = '16:30:00', end_time = '09:30:00', include_end = False)
# Extract data for initial balance (rth) from the 24 hour data set
initial_balance = data.between_time(start_time = '09:30:00', end_time = '10:30:00', include_end = False)
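For readers on a recent pandas, here is a minimal sketch of the same three selections. It assumes pandas 1.4 or later, where `between_time` takes `inclusive="left"` in place of `include_end=False`. The `data` frame below is synthetic and purely illustrative; it is not the asker's data set.

```python
import numpy as np
import pandas as pd

# Synthetic 24-hour minute bars for two days (hypothetical stand-in for `data`).
index = pd.date_range("2007-05-07 00:00", "2007-05-08 23:59", freq="min")
rng = np.random.default_rng(0)
close = 100 + rng.standard_normal(len(index)).cumsum()
data = pd.DataFrame(
    {"Open": close, "High": close + 0.5, "Low": close - 0.5,
     "Close": close, "Volume": 100},
    index=index,
)

# inclusive="left" keeps the start time and drops the end time,
# which is what include_end=False did in older pandas.
rth = data.between_time("09:30", "16:15", inclusive="left")
# A start time later than the end time wraps around midnight.
eth = data.between_time("16:30", "09:30", inclusive="left")
initial_balance = data.between_time("09:30", "10:30", inclusive="left")

print(initial_balance.index[0], initial_balance.index[-1])
```

Each day contributes 60 rows to `initial_balance`, from 09:30 through 10:29.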
I got stuck when I tried to separate the opening range by individual date and get the initial balance:
conversion = {'Open' : 'first', 'High' : 'max', 'Low' : 'min', 'Close' : 'last', 'Volume' : 'sum'}
sample = data.between_time(start_time = '09:30:00', end_time = '10:30:00', include_end = False)
sample = sample.ix['2007-05-07']
sample.tail()
sample.resample('60Min', how = conversion)
By default, resample starts at the beginning of the hour. I would like it to start from where the data starts.
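The behaviour described here can be reproduced with a small sketch. It uses synthetic minute bars for a single day (made-up values, not the asker's data) and the current `.resample(...).agg(...)` spelling, since the `how=` keyword and `.ix` indexer are no longer available in recent pandas releases.

```python
import pandas as pd

# One hour of synthetic minute bars, 09:30 through 10:29 (illustrative values).
index = pd.date_range("2007-05-07 09:30", "2007-05-07 10:29", freq="min")
prices = pd.Series(range(len(index)), index=index, dtype="float64") + 100.0
sample = pd.DataFrame(
    {"Open": prices, "High": prices + 0.5, "Low": prices - 0.5,
     "Close": prices + 0.25, "Volume": 10},
    index=index,
)

conversion = {"Open": "first", "High": "max", "Low": "min",
              "Close": "last", "Volume": "sum"}

# With default settings the 60-minute bins are aligned to the top of the hour,
# so the 09:30-10:29 data is split across the 09:00 and 10:00 bins.
default_bars = sample.resample("60min").agg(conversion)
print(default_bars)
```

The result has two rows, labelled 09:00 and 10:00, each covering 30 minutes of data.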
Accepted answer by Andy Hayden
You can use the base argument of resample:
sample.resample('60Min', how=conversion, base=30)
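Note for newer pandas: `base` was deprecated in pandas 1.1 and later removed, with `offset` (and `origin`) as the replacements. A minimal sketch of what should be the equivalent call follows, again on synthetic single-day data rather than the asker's, with the range computed as High minus Low (an assumption about what "range" means in the question).

```python
import pandas as pd

# One hour of synthetic minute bars, 09:30 through 10:29 (illustrative values).
index = pd.date_range("2007-05-07 09:30", "2007-05-07 10:29", freq="min")
prices = pd.Series(range(len(index)), index=index, dtype="float64") + 100.0
sample = pd.DataFrame(
    {"Open": prices, "High": prices + 0.5, "Low": prices - 0.5,
     "Close": prices + 0.25, "Volume": 10},
    index=index,
)

conversion = {"Open": "first", "High": "max", "Low": "min",
              "Close": "last", "Volume": "sum"}

# offset="30min" shifts the hourly bin edges to :30, like base=30 did.
hourly = sample.resample("60min", offset="30min").agg(conversion)

# Opening range, taken here as High minus Low of the single 60-minute bar.
opening_range = hourly["High"] - hourly["Low"]
print(hourly)
print(opening_range)
```

With the bins shifted by 30 minutes, the whole hour lands in a single row labelled 09:30.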
From the linked docs:
base : int, default 0
For frequencies that evenly subdivide 1 day, the "origin" of the aggregated intervals.
For example, for '5min' frequency, base could range from 0 through 4. Defaults to 0.
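To make the '5min' example from the quoted docs concrete, here is a small sketch of how a shift of 2 minutes moves the bin edges. It uses `offset="2min"`, which is assumed to play the role of `base=2` in pandas 1.1 and later; the series is synthetic.

```python
import pandas as pd

# Ten one-minute observations starting at 09:30, each counted as 1.
index = pd.date_range("2007-05-07 09:30", periods=10, freq="min")
volume = pd.Series(1, index=index)

# Default alignment: 5-minute bins start at :30, :35, ...
no_shift = volume.resample("5min").sum()

# Shifted by 2 minutes: bins start at :27, :32, :37, ...
shifted = volume.resample("5min", offset="2min").sum()

print(no_shift)
print(shifted)
```

Because the bin edges are counted from midnight plus the offset, the first shifted bin is labelled 09:27 even though the data only begins at 09:30.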

