Original question: http://stackoverflow.com/questions/1790550/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Running average in Python
Asked by Nate Kohl
Is there a pythonic way to build up a list that contains a running average of some function?
After reading a fun little piece about Martians, black boxes, and the Cauchy distribution, I thought it would be fun to calculate a running average of the Cauchy distribution myself:
import math
import random

def cauchy(location, scale):
    p = 0.0
    while p == 0.0:
        p = random.random()
    return location + scale*math.tan(math.pi*(p - 0.5))

# is this next block of code a good way to populate running_avg?
sum = 0
count = 0
max = 10
running_avg = []
while count < max:
    num = cauchy(3,1)
    sum += num
    count += 1
    running_avg.append(sum/count)

print running_avg  # or do something else with it, besides printing
I think that this approach works, but I'm curious if there might be a more elegant approach to building up that running_avg list than using loops and counters (e.g. list comprehensions).
There are some related questions, but they address more complicated problems (small window size, exponential weighting) or aren't specific to Python:
Answered by orip
You could write a generator:
def running_average():
    sum = 0
    count = 0
    while True:
        sum += cauchy(3,1)
        count += 1
        yield sum/count
Or, given a generator for Cauchy numbers and a utility function for a running sum generator, you can have a neat generator expression:
# Cauchy numbers generator
def cauchy_numbers():
    while True:
        yield cauchy(3,1)

# running sum utility function
def running_sum(iterable):
    sum = 0
    for x in iterable:
        sum += x
        yield sum

# Running averages generator expression (** the neat part **)
running_avgs = (sum/(i+1) for (i,sum) in enumerate(running_sum(cauchy_numbers())))

# goes on forever
for avg in running_avgs:
    print avg

# alternatively, take just the first 10
import itertools
for avg in itertools.islice(running_avgs, 10):
    print avg
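On Python 3.2 and later, the standard library's itertools.accumulate can stand in for the hand-written running_sum helper. The following is a minimal sketch under that assumption, written in Python 3 syntax; the name running_averages is illustrative and not part of the answer above.

```python
import itertools
import math
import random


def cauchy(location, scale):
    """Draw one Cauchy-distributed sample (same approach as in the question)."""
    p = 0.0
    while p == 0.0:
        p = random.random()
    return location + scale * math.tan(math.pi * (p - 0.5))


def cauchy_numbers():
    """Endless stream of Cauchy samples with location 3 and scale 1."""
    while True:
        yield cauchy(3, 1)


def running_averages(iterable):
    """Lazily yield the running average of the values in `iterable`."""
    # accumulate() yields the running sums; enumerate(..., 1) supplies the count.
    for count, total in enumerate(itertools.accumulate(iterable), 1):
        yield total / count


# Take just the first 10 running averages of the endless stream.
first_ten = list(itertools.islice(running_averages(cauchy_numbers()), 10))
print(first_ten)
```

Because everything stays lazy, the endless cauchy_numbers() stream is only consumed as far as islice asks for.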
Answered by Markus Jarderot
You could use coroutines. They are similar to generators, but allow you to send in values. Coroutines were added in Python 2.5, so this won't work in versions before that.
def running_average():
    sum = 0.0
    count = 0
    value = yield(float('nan'))
    while True:
        sum += value
        count += 1
        value = yield(sum/count)

ravg = running_average()
next(ravg)  # advance the coroutine to the first yield

for i in xrange(10):
    avg = ravg.send(cauchy(3,1))
    print 'Running average: %.6f' % (avg,)
As a list comprehension:
ravg = running_average()
next(ravg)
ravg_list = [ravg.send(cauchy(3,1)) for i in xrange(10)]
Edits:
- Using the next() function instead of the it.next() method. This is so it will also work with Python 3. The next() function has also been back-ported to Python 2.6+. In Python 2.5, you can either replace the calls with it.next(), or define a next function yourself. (Thanks Adam Parkin)
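For Python 2.5, the "define a next function yourself" option might look like the sketch below. This is only an assumption about what such a shim could be, not code from the answer; on Python 2.6+ and Python 3 the built-in next() already exists, so the fallback branch is never taken. The variable is named total here rather than sum to avoid shadowing the built-in.

```python
try:
    next  # built in on Python 2.6+ and Python 3
except NameError:
    # Hypothetical fallback for Python 2.5: delegate to the iterator's method.
    def next(iterator):
        return iterator.next()


def running_average():
    """Coroutine: send in a value, get back the running average so far."""
    total = 0.0
    count = 0
    value = yield float('nan')
    while True:
        total += value
        count += 1
        value = yield total / count


ravg = running_average()
next(ravg)  # advance the coroutine to its first yield
averages = [ravg.send(x) for x in (2.0, 4.0, 6.0)]
```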
Answered by Bryan McLemore
I've got two possible solutions here for you. Both are just generic running average functions that work on any list of numbers (and could be made to work with any iterable).
Generator based:
nums = [cauchy(3,1) for x in xrange(10)]

def running_avg(numbers):
    # use the argument, not the global nums, so any list can be passed in
    for count in xrange(1, len(numbers)+1):
        yield sum(numbers[:count])/count

print list(running_avg(nums))
List comprehension based (really the same code as the earlier one):
nums = [cauchy(3,1) for x in xrange(10)]
print [sum(nums[:count])/count for count in xrange(1, len(nums)+1)]
Generator-compatible, generator based:
Edit: I just tested this one to see if I could easily make my solution compatible with generators and what its performance would be. This is what I came up with.
def running_avg(numbers):
    sum = 0
    for count, number in enumerate(numbers):
        sum += number
        yield sum/(count+1)
See the performance stats below, well worth it.
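To illustrate what "generator-compatible" buys, here is a small sketch in Python 3 syntax that feeds the incremental version an endless iterator and slices off the first few averages. The use of itertools.count(1) as input and the variable name total are illustrative assumptions.

```python
import itertools


def running_avg(numbers):
    """Yield the running average of any iterable, keeping a running total."""
    total = 0
    for count, number in enumerate(numbers, 1):
        total += number
        yield total / count


# No len() or slicing is needed, so an infinite iterator works as input.
first_five = list(itertools.islice(running_avg(itertools.count(1)), 5))
print(first_five)
```

The slicing-based versions above could not accept this input, since they call len() and slice the sequence.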
Performance characteristics:
Edit: I also decided to test Orip's interesting use of multiple generators to see the impact on performance.
Using timeit and the following (1,000,000 iterations 3 times):
from timeit import Timer

print "Generator based:", ', '.join(str(x) for x in Timer('list(running_avg(nums))', 'from __main__ import nums, running_avg').repeat())
print "LC based:", ', '.join(str(x) for x in Timer('[sum(nums[:count])/count for count in xrange(1, len(nums)+1)]', 'from __main__ import nums').repeat())
print "Orip's:", ', '.join(str(x) for x in Timer('list(itertools.islice(running_avgs, 10))', 'from __main__ import itertools, running_avgs').repeat())
print "Generator-compatible Generator based:", ', '.join(str(x) for x in Timer('list(running_avg(nums))', 'from __main__ import nums, running_avg').repeat())
I get the following results:
Generator based: 17.653908968, 17.8027219772, 18.0342400074
LC based: 14.3925321102, 14.4613749981, 14.4277560711
Orip's: 30.8035550117, 30.3142540455, 30.5146529675
Generator-compatible Generator based: 3.55352187157, 3.54164409637, 3.59098005295
See comments for code:
Orip's genEx based: 4.31488609314, 4.29926609993, 4.30518198013
Results are in seconds, and show the new generator-compatible generator method to be consistently faster; your results may vary, though. I expect the massive difference between my original generator and the new one comes from the original recomputing sum(numbers[:count]) from scratch at every step, while the new one keeps a running total.
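The difference can be reproduced with a sketch like the following (Python 3 syntax; the function names, input size, and repetition count are arbitrary choices, not the ones used for the figures above). The slicing version re-adds the first count elements on every step, so its work grows quadratically with the input length, while the incremental version does a constant amount of work per element.

```python
import timeit


def running_avg_slicing(numbers):
    """Recompute the sum of the prefix at every step (quadratic work)."""
    for count in range(1, len(numbers) + 1):
        yield sum(numbers[:count]) / count


def running_avg_incremental(numbers):
    """Keep a running total instead (linear work)."""
    total = 0
    for count, number in enumerate(numbers, 1):
        total += number
        yield total / count


nums = [float(x) for x in range(200)]

# Time 200 full passes of each version over the same input.
t_slicing = timeit.timeit(lambda: list(running_avg_slicing(nums)), number=200)
t_incremental = timeit.timeit(lambda: list(running_avg_incremental(nums)), number=200)
print("slicing:     %.4f s" % t_slicing)
print("incremental: %.4f s" % t_incremental)
```

Absolute timings depend on the machine and Python version, so no expected numbers are given here.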