Calculate rolling / moving average in C++

Source: Stack Overflow, http://stackoverflow.com/questions/10990618/
This page reproduces a popular Stack Overflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license and attribute it to the original authors (not to this page's maintainer), citing the original URL.

Time: 2020-08-27 14:43:16  Source: igfitidea

Tags: c++, boost, moving-average

Asked by goji

I know this is achievable with boost as per:

Using boost::accumulators, how can I reset a rolling window size, does it keep extra history?

But I really would like to avoid using boost. I have googled and not found any suitable or readable examples.

Basically I want to track the moving average of an ongoing stream of floating point numbers, using the most recent 1000 numbers as a data sample.

What is the easiest way to achieve this?

I experimented with using a circular array, an exponential moving average and a simpler moving average, and found that the results from the circular array suited my needs best.

Accepted answer by Karthik Kumar Viswanathan

You simply need a circular array of 1000 elements in which each slot stores the new element added to the previously stored value. The stored values then form an increasing (cumulative) sum, so you can always get the sum of the samples between any two positions as the difference of the two stored values, and divide by the number of elements between them to yield the average.
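The idea can be sketched roughly as follows. This is only an illustration under some assumptions: the class name CumulativeRing is hypothetical, and it uses N + 1 slots so that the running total just before the window is still available when the window is full. Because the running total grows without bound, floating point precision may degrade on very long streams.

```cpp
#include <array>
#include <cstddef>

// Hypothetical helper, not taken from the answer: a ring of running totals.
template <std::size_t N>
class CumulativeRing {
public:
    void add(double sample) {
        std::size_t prev = head_;
        head_ = (head_ + 1) % (N + 1);
        sums_[head_] = sums_[prev] + sample;  // running total up to this sample
        if (count_ < N) ++count_;
    }

    // Average of the most recent min(samples seen, N) samples; 0.0 if empty.
    double average() const {
        if (count_ == 0) return 0.0;
        // Slot holding the running total just before the window begins.
        std::size_t start = (head_ + (N + 1) - count_) % (N + 1);
        return (sums_[head_] - sums_[start]) / static_cast<double>(count_);
    }

private:
    std::array<double, N + 1> sums_{};  // zero-initialised
    std::size_t head_ = 0;              // index of the latest running total
    std::size_t count_ = 0;             // samples currently in the window
};
```

For the question's case one would instantiate CumulativeRing<1000> and call add() for each incoming number.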

Answer by steveha

If your needs are simple, you might just try using an exponential moving average.

http://en.wikipedia.org/wiki/Moving_average#Exponential_moving_average

Put simply, you make an accumulator variable, and as your code looks at each sample, the code updates the accumulator with the new value. You pick a constant "alpha" that is between 0 and 1, and compute this:

accumulator = (alpha * new_value) + (1.0 - alpha) * accumulator

You just need to find a value of "alpha" where the effect of a given sample only lasts for about 1000 samples.

Hmm, I'm not actually sure this is suitable for you, now that I've put it here. The problem is that 1000 is a pretty long window for an exponential moving average; I'm not sure there is an alpha that would spread the average over the last 1000 numbers, without underflow in the floating point calculation. But if you wanted a smaller average, like 30 numbers or so, this is a very easy and fast way to do it.
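A minimal sketch of the accumulator described above. The class name ExponentialAverage and the helper forWindow are hypothetical. The mapping alpha = 2 / (N + 1) is one common convention for relating alpha to a nominal window of N samples; it is an assumption here, not something this answer states. Seeding the accumulator with the first sample is also a design choice of this sketch.

```cpp
#include <cstddef>

// Hypothetical helper: exponential moving average seeded with the first sample.
class ExponentialAverage {
public:
    explicit ExponentialAverage(double alpha) : alpha_(alpha) {}

    // Assumed convention: alpha = 2 / (N + 1) for a nominal window of N samples.
    static ExponentialAverage forWindow(std::size_t n) {
        return ExponentialAverage(2.0 / (static_cast<double>(n) + 1.0));
    }

    double update(double sample) {
        if (!seeded_) {
            value_ = sample;  // initial condition: start at the first sample
            seeded_ = true;
        } else {
            value_ = alpha_ * sample + (1.0 - alpha_) * value_;
        }
        return value_;
    }

    double value() const { return value_; }

private:
    double alpha_;
    double value_ = 0.0;
    bool seeded_ = false;
};
```

With this convention, ExponentialAverage::forWindow(1000) uses an alpha of roughly 0.002. Whether that smoothing behaves closely enough to a true 1000-sample window depends on the data.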

Answer by Tony Delroy

Basically I want to track the moving average of an ongoing stream of floating point numbers, using the most recent 1000 numbers as a data sample.

Note that the code below updates total_ as elements are added or replaced, avoiding a costly O(N) traversal to calculate the sum (needed for the average) on demand.

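#include <algorithm>  // std::min
#include <cstddef>    // std::size_t

template <typename T, typename Total, std::size_t N>
class Moving_Average
{
  public:
    void operator()(T sample)
    {
        if (num_samples_ < N)
        {
            // Still filling the window: append and accumulate.
            samples_[num_samples_++] = sample;
            total_ += sample;
        }
        else
        {
            // Window full: overwrite the oldest sample and adjust the total.
            T& oldest = samples_[num_samples_++ % N];
            total_ += sample - oldest;
            oldest = sample;
        }
    }

    // Average of the samples seen so far (at most the last N).
    // Cast before dividing to avoid integer or mixed signed/unsigned division;
    // the result is NaN before the first sample has been added.
    operator double() const
    {
        return static_cast<double>(total_) / static_cast<double>(std::min(num_samples_, N));
    }

  private:
    T samples_[N];
    std::size_t num_samples_{0};
    Total total_{0};
};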

Total is made a separate parameter from T to support, e.g., using a long long when totalling 1000 longs, an int for chars, or a double to total floats.

Issues

This is a bit flawed in that num_samples_ could conceptually wrap back to 0, but it's hard to imagine anyone having 2^64 samples. If concerned, use an extra bool data member to record when the container is first filled while cycling num_samples_ around the array (best then renamed to something innocuous like "pos").
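One way the bool-plus-position variant might look is sketched below. The names Moving_Average_Pos, pos_, filled_, count() and average() are illustrative choices, not part of the answer's code, and a named average() function is used here instead of a conversion operator.

```cpp
#include <cstddef>

// Hypothetical variant: a wrapping position plus a "filled" flag, so no
// counter ever grows without bound.
template <typename T, typename Total, std::size_t N>
class Moving_Average_Pos {
    static_assert(N > 0, "window must hold at least one sample");

public:
    void operator()(T sample) {
        if (filled_) total_ -= samples_[pos_];  // drop the sample being overwritten
        samples_[pos_] = sample;
        total_ += sample;
        if (++pos_ == N) {  // wrapped around: the window is now full
            pos_ = 0;
            filled_ = true;
        }
    }

    // Number of samples currently contributing to the total.
    std::size_t count() const { return filled_ ? N : pos_; }

    // Returns 0.0 before the first sample (a choice made for this sketch).
    double average() const {
        return count() == 0
                   ? 0.0
                   : static_cast<double>(total_) / static_cast<double>(count());
    }

private:
    T samples_[N] = {};
    std::size_t pos_ = 0;
    bool filled_ = false;
    Total total_ = 0;
};
```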

Another issue is inherent with floating point precision, and can be illustrated with a simple scenario for T=double, N=2: we start with total_ = 0, then inject samples...

  • 1E17: we execute total_ += 1E17, so total_ == 1E17. Then we inject

  • 1: we execute total_ += 1, but total_ == 1E17 still, as the "1" is too insignificant to change the 64-bit double representation of a number as large as 1E17. Then we inject

  • 2: we execute total_ += 2 - 1E17, in which 2 - 1E17 is evaluated first and yields -1E17, as the 2 is lost to imprecision/insignificance. So to our total of 1E17 we add -1E17 and total_ becomes 0, despite current samples of 1 and 2, for which we'd want total_ to be 3. Our moving average will calculate 0 instead of 1.5. As we add another sample, we'll subtract the "oldest" 1 from total_ despite it never having been properly incorporated therein; our total_ and moving averages are likely to remain wrong.
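The three steps can be reproduced in isolation with the small sketch below. The function name totalAfterThreeSamples is hypothetical. It assumes IEEE 754 64-bit doubles evaluated without extended precision, as on typical x86-64 builds; other floating point modes may behave differently.

```cpp
// Hypothetical demonstration of the N = 2 scenario described above.
double totalAfterThreeSamples() {
    double total = 0.0;
    total += 1e17;        // first sample: total is exactly 1e17
    total += 1.0;         // second sample: absorbed, total stays 1e17
    total += 2.0 - 1e17;  // third sample replaces the first: 2 - 1e17 rounds to -1e17
    return total;         // 0 under the stated assumptions, not the desired 3
}
```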

You could add code that stores the highest recent total_ and, if the current total_ is too small a fraction of that (a template parameter could provide a multiplicative threshold), recalculates total_ from all the samples in the samples_ array (and sets highest_recent_total_ to the new total_), but I'll leave that to the reader who cares sufficiently.
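A rough sketch of that recalculation idea, restricted to double samples for brevity. The class name Resumming_Average and the default threshold of 1e-6 are assumptions made for illustration. The threshold is passed to the constructor rather than as a template parameter, since floating point template parameters require C++20. This check only catches the case where the total collapses relative to its recent peak; it is not a general cure for rounding error.

```cpp
#include <algorithm>  // std::max
#include <cmath>      // std::fabs
#include <cstddef>    // std::size_t

// Hypothetical sketch: re-sum the window when the running total shrinks
// to a tiny fraction of its recent peak magnitude.
template <std::size_t N>
class Resumming_Average {
    static_assert(N > 0, "window must hold at least one sample");

public:
    explicit Resumming_Average(double threshold = 1e-6) : threshold_(threshold) {}

    void operator()(double sample) {
        double& slot = samples_[pos_];  // holds 0.0 until the window has filled
        total_ += sample - slot;
        slot = sample;
        pos_ = (pos_ + 1) % N;
        if (count_ < N) ++count_;

        highest_recent_total_ = std::max(highest_recent_total_, std::fabs(total_));
        if (std::fabs(total_) < threshold_ * highest_recent_total_) resum();
    }

    double average() const {
        return count_ == 0 ? 0.0 : total_ / static_cast<double>(count_);
    }

private:
    // O(N) recalculation from the stored samples.
    void resum() {
        total_ = 0.0;
        for (double s : samples_) total_ += s;  // unused slots contribute 0.0
        highest_recent_total_ = std::fabs(total_);
    }

    double samples_[N] = {};
    std::size_t pos_ = 0;
    std::size_t count_ = 0;
    double total_ = 0.0;
    double highest_recent_total_ = 0.0;
    double threshold_;
};
```

Applied to the T=double, N=2 scenario above (samples 1E17, 1, 2), the collapse of total_ to 0 triggers a re-sum of the stored samples 2 and 1, so the average comes out as 1.5 under IEEE 754 double arithmetic.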

Answer by jxh

You can approximate a rolling average by applying a weighted average on your input stream.

template <unsigned N>
double approxRollingAverage (double avg, double input) {
    avg -= avg/N;
    avg += input/N;
    return avg;
}

This way, you don't need to maintain 1000 buckets. However, it is an approximation, so its value will not match exactly with a true rolling average.

Edit: Just noticed @steveha's post. This is equivalent to the exponential moving average, with the alpha being 1/N (I was taking N to be 1000 in this case to simulate 1000 buckets).
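The equivalence can be checked by rearranging the two update lines of approxRollingAverage. Here x stands for the input and α for alpha; both symbols are chosen only for this illustration.

```latex
\begin{aligned}
\text{avg}' &= \text{avg} - \frac{\text{avg}}{N} + \frac{x}{N} \\
            &= \frac{1}{N}\,x + \left(1 - \frac{1}{N}\right)\text{avg} \\
            &= \alpha\,x + (1 - \alpha)\,\text{avg}, \qquad \alpha = \frac{1}{N}
\end{aligned}
```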

Answer by Erik Aronesty

Simple class to calculate rolling average and also rolling standard deviation:

#include <boost/circular_buffer.hpp>
#include <cmath>
#include <cstddef>

class moving_average {
private:
    boost::circular_buffer<double> q;  // double, so pushed samples are not truncated
    double sum = 0.0;                  // sum of the samples in the window
    double ssq = 0.0;                  // sum of their squares
public:
    explicit moving_average(std::size_t n) : q(n) {}

    void push(double v) {
        if (q.full()) {
            // Window is full: retire the oldest sample before adding the new one.
            double t = q.front();
            sum -= t;
            ssq -= t * t;
            q.pop_front();
        }
        q.push_back(v);
        sum += v;
        ssq += v * v;
    }
    std::size_t size() const {
        return q.size();
    }
    double mean() const {
        return sum / size();
    }
    // Sample standard deviation; needs at least two samples.
    double stdev() const {
        double n = static_cast<double>(size());
        return std::sqrt((n * ssq - sum * sum) / (n * (n - 1)));
    }
};

Answer by baumann

I use this quite often in hard realtime systems that have fairly insane update rates (50 kilosamples/sec). As a result, I typically precompute the scalars.

To compute a moving average of N samples, precompute scalar1 = 1/N and scalar2 = 1 - scalar1 (that is, 1 - 1/N), then update on each sample:

Average = currentSample*scalar1 + Average*scalar2;

Example: Sliding average of 10 elements

double scalar1 = 1.0/10.0;  // 0.1
double scalar2 = 1.0 - scalar1; // 0.9
bool first_sample = true;
double average = 0.0;
while (someCondition)  // someCondition and getSample() are placeholders
{
   double newSample = getSample();
   if (first_sample)
   {
      // everybody forgets the initial condition *sigh*
      average = newSample;
      first_sample = false;
   }
   else
   {
      average = (newSample*scalar1) + (average*scalar2);
   }
}

Note: this is just a practical implementation of the answer given by steveha above. Sometimes it's easier to understand a concrete example.

Answer by Tim

You could implement a ring buffer. Make an array of 1000 elements, and some fields to store the start and end indexes and total size. Then just store the last 1000 elements in the ring buffer, and recalculate the average as needed.
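A minimal sketch of such a ring buffer, recalculating the sum only when the average is requested. The class name RingAverage is hypothetical. It keeps a single write index instead of separate start and end indexes, which is a simplification that works here because the average does not depend on the order of the stored samples.

```cpp
#include <array>
#include <cstddef>
#include <numeric>

// Hypothetical helper: stores the last N samples and sums them on demand.
template <std::size_t N>
class RingAverage {
    static_assert(N > 0, "window must hold at least one sample");

public:
    void push(double sample) {
        data_[end_] = sample;   // overwrites the oldest sample once full
        end_ = (end_ + 1) % N;
        if (size_ < N) ++size_;
    }

    // O(N) per call; returns 0.0 when no samples have been stored.
    double average() const {
        if (size_ == 0) return 0.0;
        // Before the first wrap the valid samples occupy slots [0, size_).
        double sum = std::accumulate(data_.begin(), data_.begin() + size_, 0.0);
        return sum / static_cast<double>(size_);
    }

private:
    std::array<double, N> data_{};
    std::size_t end_ = 0;   // next slot to write
    std::size_t size_ = 0;  // number of valid samples
};
```

This trades the constant-time update of a running total for freedom from accumulated rounding drift, since the sum is rebuilt from the stored samples each time.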

Answer by Nilesh Kumar Jha

One way is to store the values circularly in a buffer array and update the average incrementally:

int j = (int) (counter % size);
auto oldest = buffer[j];   // the slot about to be overwritten holds the oldest stored value
buffer[j] = mostrecentvalue;
avg = (avg * size - oldest + buffer[j]) / size;

counter++;

// Assumes buffer starts zero-filled, so avg is (sum so far) / size until the buffer has filled.

The whole thing runs in a loop in which mostrecentvalue changes on each iteration.

Answer by Pedro Soares

A simple moving average for 10 items, using a list:

#include <list>

std::list<float> listDeltaMA;

float getDeltaMovingAverage(float delta)
{
    listDeltaMA.push_back(delta);
    if (listDeltaMA.size() > 10) listDeltaMA.pop_front();
    float sum = 0;
    for (std::list<float>::iterator p = listDeltaMA.begin(); p != listDeltaMA.end(); ++p)
        sum += (float)*p;
    return sum / listDeltaMA.size();
}