What are Pandas "expanding window" functions?
Original question: http://stackoverflow.com/questions/45370666/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Asked by Deena
The Pandas documentation lists a bunch of "expanding window functions":
http://pandas.pydata.org/pandas-docs/version/0.17.0/api.html#standard-expanding-window-functions
But I couldn't figure out what they do from the documentation.
Answered by MaxU
You may want to read this section of the Pandas docs:
A common alternative to rolling statistics is to use an expanding window, which yields the value of the statistic with all the data available up to that point in time.
These follow a similar interface to .rolling, with the .expanding method returning an Expanding object.
As these calculations are a special case of rolling statistics, they are implemented in pandas such that the following two calls are equivalent:
In [96]: df.rolling(window=len(df), min_periods=1).mean()[:5]
Out[96]:
                   A         B         C         D
2000-01-01  0.314226 -0.001675  0.071823  0.892566
2000-01-02  0.654522 -0.171495  0.179278  0.853361
2000-01-03  0.708733 -0.064489 -0.238271  1.371111
2000-01-04  0.987613  0.163472 -0.919693  1.566485
2000-01-05  1.426971  0.288267 -1.358877  1.808650

In [97]: df.expanding(min_periods=1).mean()[:5]
Out[97]:
                   A         B         C         D
2000-01-01  0.314226 -0.001675  0.071823  0.892566
2000-01-02  0.654522 -0.171495  0.179278  0.853361
2000-01-03  0.708733 -0.064489 -0.238271  1.371111
2000-01-04  0.987613  0.163472 -0.919693  1.566485
2000-01-05  1.426971  0.288267 -1.358877  1.808650
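The equivalence quoted above can be checked directly. The following is a minimal sketch, not the data from the docs: the frame `df`, its seed, and its size are assumptions made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data: 10 daily rows, 4 columns, seeded for reproducibility.
rng = np.random.default_rng(0)
index = pd.date_range("2000-01-01", periods=10, freq="D")
df = pd.DataFrame(rng.standard_normal((10, 4)), index=index, columns=list("ABCD"))

# A rolling window as long as the whole frame never drops old rows,
# so with min_periods=1 it always covers rows 0..i ...
rolling_mean = df.rolling(window=len(df), min_periods=1).mean()
# ... which is exactly what the expanding window covers.
expanding_mean = df.expanding(min_periods=1).mean()

# Compare with a tolerance, since the two code paths may round differently.
same = np.allclose(rolling_mean, expanding_mean)

# Manual check for the third row: plain mean of the first three rows.
manual_third_row = df.iloc[:3].mean()

print(same)
```

Each row of the expanding result is simply the statistic computed over everything from the first row up to and including that row.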
Answered by Anuj Sharma
To sum up the difference between rolling and expanding functions in one line: in a rolling function the window size remains constant, whereas in an expanding function it grows with each new data point.
Example: suppose you want to predict the weather and you have 100 days of data:
Rolling: let's say the window size is 10. For the first prediction, it will use the first 10 days of data and predict day 11. For the next prediction, it will use the data from the 2nd day to the 11th day.
Expanding: for the first prediction, it will use 10 days of data. For the second prediction, however, it will use 10 + 1 days of data. The window has therefore "expanded."
- The window size grows continuously in the latter method.
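The weather example can be spelled out as the day numbers each method would look at. This is only an illustrative sketch; the helper names `rolling_window` and `expanding_window` are made up here and are not pandas functions.

```python
def rolling_window(step, size=10):
    """Days (1-based) used for prediction number `step` (0-based), fixed size."""
    first_day = step + 1
    return list(range(first_day, first_day + size))


def expanding_window(step, initial=10):
    """Days (1-based) used for prediction number `step` (0-based), growing size."""
    return list(range(1, initial + step + 1))


for step in range(3):
    r = rolling_window(step)
    e = expanding_window(step)
    # Rolling always holds 10 days; expanding holds 10, 11, 12, ... days.
    print(f"prediction {step + 1}: rolling days {r[0]}-{r[-1]} ({len(r)} days), "
          f"expanding days {e[0]}-{e[-1]} ({len(e)} days)")
```

The rolling window slides forward and forgets day 1 on the second prediction, while the expanding window keeps every day it has seen so far.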
Code example:
sums = series.expanding(min_periods=2).sum()
Here, series contains the number of apps downloaded at each point in a time series. The line above sums all downloads up to and including each point in time.
Note: min_periods=2 means the window must contain at least 2 data points (the current one included) before a value is produced; positions with fewer data points yield NaN. The aggregate here is the sum.
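To see the effect of min_periods concretely, here is a small sketch with made-up download counts; the values in `series` are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical download counts for four consecutive days.
series = pd.Series([5, 3, 8, 2])

# Expanding sum that needs at least 2 observations in the window.
sums = series.expanding(min_periods=2).sum()
print(sums.tolist())  # first entry is NaN, then 8.0, 16.0, 18.0

# With the default min_periods=1, the expanding sum is a running total,
# i.e. the same numbers as cumsum().
running_total = series.expanding().sum()
matches_cumsum = (running_total == series.cumsum()).all()
```

Only the first position is missing, because it is the only one where the window holds fewer than 2 observations.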
Answered by Deena
These illustrations from Uber explain the concepts very well:
Sliding window
Original article: https://eng.uber.com/omphalos/