Python: Resampling minute data

Notice: this page is a Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original: http://stackoverflow.com/questions/14861023/

Date: 2020-08-18 12:40:47  Source: igfitidea

Resampling Minute data

python pandas

Asked by aozkan

I have minute-based OHLCV data for the opening range/first hour (9:30-10:30 AM EST). I want to resample this data into a single 60-minute bar and then calculate the range.

When I call the dataframe.resample() function on the data I get two rows and the initial row starts at 9:00 AM. I'm looking to get only one row which starts at 9:30 AM.

Note: the initial data begins at 9:30.

Edit: Adding code:

# Extract data for regular trading hours (rth) from the 24 hour data set
rth = data.between_time(start_time = '09:30:00', end_time = '16:15:00', include_end = False)

# Extract data for extended trading hours (eth) from the 24 hour data set
eth = data.between_time(start_time = '16:30:00', end_time = '09:30:00', include_end = False)

# Extract data for initial balance (rth) from the 24 hour data set
initial_balance = data.between_time(start_time = '09:30:00', end_time = '10:30:00', include_end = False)
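
For readers on a newer pandas, the include_end flag used above has been replaced by a single inclusive argument. The sketch below assumes pandas 1.4 or later and uses a made-up one-day minute index in place of the asker's data, which is not shown in the question.

```python
import pandas as pd

# Synthetic 24-hour minute index for one day (made-up data, not the asker's).
idx = pd.date_range("2007-05-07 00:00", periods=24 * 60, freq="min")
data = pd.DataFrame({"Close": range(len(idx))}, index=idx)

# inclusive="left" keeps the start time and drops the end time,
# which mirrors include_end=False in the snippet above.
initial_balance = data.between_time("09:30", "10:30", inclusive="left")

print(initial_balance.index[0], initial_balance.index[-1], len(initial_balance))
```

The selection runs from 09:30 through 10:29, so the 10:30 bar is excluded, as in the original call.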

I got stuck trying to separate the opening range by individual date and get the initial balance:

conversion = {'Open' : 'first', 'High' : 'max', 'Low' : 'min', 'Close' : 'last', 'Volume' : 'sum'}
sample = data.between_time(start_time = '09:30:00', end_time = '10:30:00', include_end = False)
sample = sample.ix['2007-05-07']
sample.tail()

sample.resample('60Min', how = conversion) 

By default resample starts at the beginning of the hour. I would like it to start from where the data starts.
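
The behavior described here can be reproduced with a small synthetic series. This is only an illustration: the index and values are made up, and it assumes a pandas release recent enough to use the .resample(...).last() method-chaining style.

```python
import numpy as np
import pandas as pd

# Synthetic one-minute closes for 09:30-10:29 on one day (made-up values).
idx = pd.date_range("2007-05-07 09:30", periods=60, freq="min")
close = pd.Series(np.arange(60.0), index=idx)

# Hourly bins are anchored on the hour by default, so the 60 minutes of data
# are split into a 09:00 bin (09:30-09:59) and a 10:00 bin (10:00-10:29).
default_bins = close.resample("60min").last()
print(default_bins)
```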

Accepted answer by Andy Hayden

You can use the base argument of resample:

sample.resample('60Min', how=conversion, base=30)

From the above docs-link:

base: int, default 0
    For frequencies that evenly subdivide 1 day, the “origin” of the aggregated intervals.
    For example, for '5min' frequency, base could range from 0 through 4. Defaults to 0
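
Recent pandas releases no longer accept the how= and base= keywords, so the call above may fail there. A rough equivalent, assuming pandas 1.1 or later, passes the aggregation dict to .agg() and shifts the bins with offset. The OHLCV values below are synthetic stand-ins for the asker's data.

```python
import numpy as np
import pandas as pd

# Synthetic one-minute OHLCV bars for 09:30-10:29 (made-up values).
idx = pd.date_range("2007-05-07 09:30", periods=60, freq="min")
prices = np.linspace(100.0, 105.9, 60)
sample = pd.DataFrame(
    {
        "Open": prices,
        "High": prices + 0.5,
        "Low": prices - 0.5,
        "Close": prices + 0.1,
        "Volume": 10,
    },
    index=idx,
)

conversion = {"Open": "first", "High": "max", "Low": "min",
              "Close": "last", "Volume": "sum"}

# offset="30min" shifts the hourly bin edges by 30 minutes, playing the role
# that base=30 plays in the answer; .agg(conversion) replaces how=conversion.
hourly = sample.resample("60min", offset="30min").agg(conversion)

# The range the question asks for: high minus low of the single 60-minute bar.
hourly["Range"] = hourly["High"] - hourly["Low"]
print(hourly)
```

If the session start is not known in advance, origin="start" anchors the bins at the first timestamp in the data instead of a fixed offset.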
