Calculating Standard Deviation & Variance in C++

Notice: this page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/33268513/

Date: 2020-08-28 14:14:34  Source: igfitidea

Tags: c++, arrays, average, variance, standard-deviation

Asked by Hyman

I've posted a few times before, and my earlier questions were pretty vague.

I started C++ this week and have been working on a small project.

I'm trying to calculate the standard deviation and variance.

My code loads a file of 100 integers, puts them into an array, counts them, and calculates the mean, sum, variance, and standard deviation.

I'm having a little trouble with the variance, though.

I keep getting a huge number, and I have a feeling it has to do with how it is calculated.

My mean and sum are OK.

Any help or tips?

NB: sd & mean calcs

Cheers,

Hyman

#include <cmath>
#include <fstream>
#include <functional>
#include <iostream>
#include <iterator>
#include <numeric>
#include <sstream>
#include <string>

using namespace std;

int main()
{
    int n = 0;
    int Array[100];
    float mean;
    float var;
    float sd;
    string line;
    float numPoints;

    ifstream myfile("numbers.txt");

    if (myfile.is_open())
    {
        while (!myfile.eof())
        {
            getline(myfile, line);
            stringstream convert(line);

            if (!(convert >> Array[n]))
            {
                Array[n] = 0;
            }
            cout << Array[n] << endl;
            n++;
        }
        myfile.close();
        numPoints = n;
    }
    else cout << "Error loading file" << endl;

    int sum = accumulate(begin(Array), end(Array), 0, plus<int>());
    cout << "The sum of all integers: " << sum << endl;

    mean = sum / numPoints;
    cout << "The mean of all integers: " << mean << endl;

    var = ((Array[n] - mean) * (Array[n] - mean)) / numPoints;
    sd = sqrt(var);
    cout << "The standard deviation is: " << sd << endl;

    return 0;
}

Answered by Ahmed Akhtar

As the other answer by horseshoe correctly suggests, you have to use a loop to calculate the variance. Otherwise the statement

var = ((Array[n] - mean) * (Array[n] - mean)) / numPoints;

considers only a single element of the array: Array[n], which at that point is the slot just past the last value read.

Here is a slightly improved version of horseshoe's suggested code:

var = 0;
for( n = 0; n < numPoints; n++ )
{
  var += (Array[n] - mean) * (Array[n] - mean);
}
var /= numPoints;
sd = sqrt(var);

Your sum works fine without an explicit loop because you are using the accumulate function, which already has a loop inside it even though that is not evident in the code. Take a look at the equivalent behavior of accumulate for a clear understanding of what it is doing.

Note: X ?= Y is short for X = X ? Y, where ? stands for a binary arithmetic or bitwise operator such as +, -, * or /. You can also use pow(Array[n] - mean, 2) to take the square instead of multiplying the difference by itself, which is tidier.
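To make the point about the hidden loop concrete, here is a minimal sketch of what accumulate does with the default operator+. The names accumulate_sketch and sketch_matches_std are made up for this illustration and are not part of the standard library; real implementations may differ in detail.

```cpp
#include <iterator>
#include <numeric>

// Hypothetical helper, written only for illustration: a hand-rolled loop that
// mirrors what std::accumulate(first, last, init) does with the default operator+.
template <typename InputIt, typename T>
T accumulate_sketch(InputIt first, InputIt last, T init)
{
    for (; first != last; ++first)
    {
        // For arithmetic types this is the same as the compound form: init += *first;
        init = init + *first;
    }
    return init;
}

// Sums a small fixed array both ways so the two results can be compared.
inline bool sketch_matches_std()
{
    int Array[5] = {1, 2, 3, 4, 5};
    int by_hand = accumulate_sketch(std::begin(Array), std::end(Array), 0);
    int by_std = std::accumulate(std::begin(Array), std::end(Array), 0);
    return by_hand == by_std && by_hand == 15;
}
```

Calling sketch_matches_std() returns true: both versions walk the array once and add each element to the running total.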

Answered by rayryeng

Here's another approach that uses std::accumulate but not pow. In addition, we can use an anonymous function to define how to calculate the variance once the mean is known. Note that this computes the unbiased sample variance, which divides by n - 1 rather than n.

#include <algorithm>
#include <cstddef>
#include <iostream>
#include <numeric>
#include <vector>

template<typename T>
T variance(const std::vector<T> &vec)
{
    const std::size_t sz = vec.size();
    // The sample variance needs at least two values
    if (sz <= 1)
        return 0.0;

    // Calculate the mean
    const T mean = std::accumulate(vec.begin(), vec.end(), 0.0) / sz;

    // Now calculate the variance: each term is already divided by (sz - 1),
    // so the accumulated total is the unbiased sample variance
    auto variance_func = [&mean, &sz](T accumulator, const T& val)
    {
        return accumulator + ((val - mean)*(val - mean) / (sz - 1));
    };

    return std::accumulate(vec.begin(), vec.end(), 0.0, variance_func);
}

A sample of how to use this function:

int main()
{
    std::vector<double> vec = {1.0, 5.0, 6.0, 3.0, 4.5};
    std::cout << variance(vec) << std::endl;
}

Answered by horseshoe

Your variance calculation is outside the loop, so it is based only on the single element at index n == 100, which is past the end of the 100-element array. You need an additional loop:

var = 0;
n=0;
while (n<numPoints){
   var = var + ((Array[n] - mean) * (Array[n] - mean));
   n++;
}
var /= numPoints;
sd = sqrt(var);
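As a way to check the fix, the loop can be wrapped in a small function and run on data whose answer is easy to work out by hand. This is only a sketch: variance_by_loop and sd_by_loop are made-up names, Array and numPoints follow the question's naming, and the functions assume numPoints is greater than zero.

```cpp
#include <cmath>

// Hypothetical wrapper around the loop shown above.
// Returns the population variance (divides by numPoints).
inline float variance_by_loop(const int* Array, int numPoints)
{
    // First pass: the mean
    float sum = 0;
    for (int n = 0; n < numPoints; n++)
    {
        sum += Array[n];
    }
    const float mean = sum / numPoints;

    // Second pass: accumulate every squared deviation, not just one element
    float var = 0;
    for (int n = 0; n < numPoints; n++)
    {
        var += (Array[n] - mean) * (Array[n] - mean);
    }
    return var / numPoints;
}

// Standard deviation is the square root of the variance.
inline float sd_by_loop(const int* Array, int numPoints)
{
    return std::sqrt(variance_by_loop(Array, numPoints));
}
```

With the eight values {2, 4, 4, 4, 5, 5, 7, 9} the mean is 5 and the squared deviations sum to 32, so the variance is 4 and the standard deviation is 2.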

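Answered by D.Zadravec

Two simple functions that calculate the standard deviation and variance in C++. The variance is the sample variance (it divides by size - 1) and is computed in a single pass over the data.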

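#include <cmath>
#include <cstddef>
#include <iostream>
#include <vector>

double StandardDeviation(const std::vector<double>& samples);
double Variance(const std::vector<double>& samples);

int main()
{
    std::vector<double> samples;
    samples.push_back(2.0);
    samples.push_back(3.0);
    samples.push_back(4.0);
    samples.push_back(5.0);
    samples.push_back(6.0);
    samples.push_back(7.0);

    double sd = StandardDeviation(samples);
    std::cout << sd << std::endl;
    return 0;
}

// Sample standard deviation: square root of the sample variance
double StandardDeviation(const std::vector<double>& samples)
{
    return std::sqrt(Variance(samples));
}

// Sample variance (divides by size - 1), computed in a single pass
double Variance(const std::vector<double>& samples)
{
    const std::size_t size = samples.size();
    // With fewer than two samples there is no spread to measure
    if (size < 2)
        return 0.0;

    double variance = 0;
    double t = samples[0]; // running total of the samples seen so far
    for (std::size_t i = 1; i < size; i++)
    {
        t += samples[i];
        double diff = ((i + 1) * samples[i]) - t;
        variance += (diff * diff) / ((i + 1.0) * i);
    }

    return variance / (size - 1);
}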

Answered by Caleth

Rather than writing out more loops, you can create a function object to pass to std::accumulate to calculate the variance.

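#include <cmath>

// Function object for std::accumulate: adds the squared deviation of each
// value from the mean to the running total
template <typename T>
struct normalize {
    T operator()(T initial, T value) const {
        return initial + std::pow(value - mean, 2);
    }
    T mean;
};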

While we are at it, we can use std::istream_iterator to do the file loading, and std::vector because we don't know how many values there are at compile time. This gives us:

#include <cmath>
#include <fstream>
#include <functional>
#include <iostream>
#include <iterator>
#include <numeric>
#include <vector>

int main()
{
    std::vector<int> values; // empty for now; filled from the file below

    std::ifstream myfile("numbers.txt");
    if (myfile)
    {
        values.assign(std::istream_iterator<int>(myfile), {});
    }
    else { std::cout << "Error loading file" << std::endl; }

    if (values.empty()) { return 1; } // nothing to average

    float sum = std::accumulate(values.begin(), values.end(), 0, std::plus<int>()); // plus is the default for accumulate, can be omitted
    std::cout << "The sum of all integers: " << sum << std::endl;
    float mean = sum / values.size();
    std::cout << "The mean of all integers: " << mean << std::endl;
    // The initial value is 0.0f so that the running total is kept as a float
    float var = std::accumulate(values.begin(), values.end(), 0.0f, normalize<float>{ mean }) / values.size();
    float sd = std::sqrt(var);
    std::cout << "The standard deviation is: " << sd << std::endl;
    return 0;
}
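One detail worth keeping in mind when passing a function object to std::accumulate: the running total has the type of the initial value, not the type the function object returns. With an integer initial value, every partial result is converted back to int, so fractional parts are lost at each step. The sketch below shows the effect with plain addition; the two function names are made up for this illustration.

```cpp
#include <numeric>
#include <vector>

// The accumulator takes the type of the initial value.
// Here it is int, so each partial sum is truncated to a whole number.
inline float sum_with_int_init(const std::vector<float>& v)
{
    return std::accumulate(v.begin(), v.end(), 0);
}

// Here the initial value is a float, so the partial sums keep their fractions.
inline float sum_with_float_init(const std::vector<float>& v)
{
    return std::accumulate(v.begin(), v.end(), 0.0f);
}
```

For four copies of 0.5f, sum_with_int_init returns 0 because each partial sum of 0.5 is truncated back to 0, while sum_with_float_init returns 2. The same applies to a sum of squared deviations, so a floating-point initial value is the safer choice there.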