Pandas Resample Apply Custom Function?

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/41300653/

Time: 2020-09-14 02:40:03  Source: igfitidea

python pandas

Asked by Jason B

I'm trying to use pandas to resample 15-minute periods into 1-hour periods by applying a custom function. My DataFrame is in this format:

Date                      val1       val2                  
2016-01-30 07:00:00       49.0       45.0
2016-01-30 07:15:00       49.0       44.0
2016-01-30 07:30:00       52.0       47.0
2016-01-30 07:45:00       60.0       46.0
2016-01-30 08:00:00       63.0       61.0
2016-01-30 08:15:00       61.0       60.0
2016-01-30 08:30:00       62.0       61.0
2016-01-30 08:45:00       63.0       61.0
2016-01-30 09:00:00       68.0       60.0
2016-01-30 09:15:00       71.0       70.0
2016-01-30 09:30:00       71.0       70.0

...and I want to resample with this function:

import math
def log_add(array_like):
    # 10*log10 of the sum of 10**(x/10) over the values
    return 10 * math.log10(sum(10 ** (i / 10) for i in array_like))

I do:

df.resample('1H').apply(log_add)

but this returns an empty DataFrame. Doing this:

df.resample('1H').apply(lambda x: log_add(x))

does the same. Does anyone have any idea why it's not applying the function properly?

Any help would be appreciated, thanks.

Answered by jezrael

You can add the on parameter, which was implemented in pandas 0.19.0:

print (df.resample('1H', on='Date').apply(log_add))

Or set Date as the index with set_index:

df.set_index('Date', inplace=True)
print (df.resample('1H').apply(log_add))
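
For reference, here is a minimal self-contained sketch of both variants, using the sample rows from the question. It assumes Date is already a datetime column, and it uses '60min' in place of '1H' because recent pandas releases warn about the uppercase 'H' alias. The names by_on and by_index are only illustrative.

```python
import math

import pandas as pd


def log_add(array_like):
    # 10*log10 of the sum of 10**(x/10), as defined in the question
    return 10 * math.log10(sum(10 ** (i / 10) for i in array_like))


# Sample rows copied from the question; Date is built as datetime up front
df = pd.DataFrame({
    "Date": pd.date_range("2016-01-30 07:00", periods=11, freq="15min"),
    "val1": [49.0, 49.0, 52.0, 60.0, 63.0, 61.0, 62.0, 63.0, 68.0, 71.0, 71.0],
    "val2": [45.0, 44.0, 47.0, 46.0, 61.0, 60.0, 61.0, 61.0, 60.0, 70.0, 70.0],
})

# Variant 1: keep Date as a column and point resample at it
by_on = df.resample("60min", on="Date").apply(log_add)

# Variant 2: move Date into the index first (returns a copy, df is unchanged)
by_index = df.set_index("Date").resample("60min").apply(log_add)

print(by_on)
```

Both variants should give one row per hour (07:00, 08:00, 09:00) with the val1 and val2 columns, since log_add is called once per column per hourly bin.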

Also, first check whether the dtype of column Date is datetime; if not, convert it with to_datetime:

print (df.dtypes)
Date     object
val1    float64
val2    float64
dtype: object

df.Date = pd.to_datetime(df.Date)

print (df.dtypes)
Date    datetime64[ns]
val1           float64
val2           float64
dtype: object
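
The check and conversion can also be done in code rather than by reading the printed dtypes. The sketch below is only an illustration: the small frame is hypothetical (Date stored as strings), max() stands in for the custom function to keep it short, and '60min' is used in place of '1H'.

```python
import pandas as pd

# Hypothetical frame where Date was read in as plain strings (dtype object)
df = pd.DataFrame({
    "Date": ["2016-01-30 07:00:00", "2016-01-30 07:15:00", "2016-01-30 08:00:00"],
    "val1": [49.0, 49.0, 63.0],
})

# Convert only when the column is not already a datetime dtype
if not pd.api.types.is_datetime64_any_dtype(df["Date"]):
    df["Date"] = pd.to_datetime(df["Date"])

# Resampling on the Date column works once it is datetime
hourly_max = df.resample("60min", on="Date")["val1"].max()
print(hourly_max)
```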