Calculating Standard Deviation & Variance in C++
Notice: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, cite the original URL and author information, and attribute it to the original authors: Stack Overflow
Original: http://stackoverflow.com/questions/33268513/
Asked by Hyman
I've posted a few times before, and my earlier questions were pretty vague.
I started C++ this week and have been working on a small project: I'm trying to calculate standard deviation and variance.
My code loads a file of 100 integers, puts them into an array, counts them, and calculates the mean, sum, variance, and standard deviation.
I'm having a little trouble with the variance, though. I keep getting a huge number, and I have a feeling the problem is in the calculation itself.
My mean and sum are OK.
Any help or tips?
Cheers,
Hyman
#include <cmath>
#include <fstream>
#include <functional>
#include <iostream>
#include <iterator>
#include <numeric>
#include <sstream>
#include <string>
using namespace std;
int main()
{
    int n = 0;
    int Array[100];
    float mean;
    float var;
    float sd;
    string line;
    float numPoints;
    ifstream myfile("numbers.txt");
    if (myfile.is_open())
    {
        while (!myfile.eof())
        {
            getline(myfile, line);
            stringstream convert(line);
            if (!(convert >> Array[n]))
            {
                Array[n] = 0;
            }
            cout << Array[n] << endl;
            n++;
        }
        myfile.close();
        numPoints = n;
    }
    else cout << "Error loading file" << endl;
    int sum = accumulate(begin(Array), end(Array), 0, plus<int>());
    cout << "The sum of all integers: " << sum << endl;
    mean = sum/numPoints;
    cout << "The mean of all integers: " << mean << endl;
    var = ((Array[n] - mean) * (Array[n] - mean)) / numPoints;
    sd = sqrt(var);
    cout << "The standard deviation is: " << sd << endl;
    return 0;
}
Answered by Ahmed Akhtar
As the other answer by horseshoe correctly suggests, you will have to use a loop to calculate the variance; otherwise the statement
var = ((Array[n] - mean) * (Array[n] - mean)) / numPoints;
will consider only a single element of the array.
Here is a slightly improved version of horseshoe's suggested code:
var = 0;
for (n = 0; n < numPoints; n++)
{
    var += (Array[n] - mean) * (Array[n] - mean);
}
var /= numPoints;
sd = sqrt(var);
Your sum works fine without an explicit loop because you are using the accumulate function, which already has a loop inside it even though that is not evident in the calling code. Take a look at the equivalent behavior of accumulate for a clear understanding of what it is doing.
Note: X ?= Y is short for X = X ? Y, where ? can be any arithmetic or bitwise binary operator (for example + or /).
You can also use pow(Array[n] - mean, 2) to take the square instead of multiplying the difference by itself, which makes the code a little tidier.
Answered by rayryeng
Here's another approach using std::accumulate but without using pow. In addition, we can use an anonymous function (a lambda) to define how to calculate the variance after we calculate the mean. Note that this computes the unbiased sample variance.
#include <cstddef>
#include <iostream>
#include <numeric>
#include <vector>
template<typename T>
T variance(const std::vector<T> &vec)
{
    const std::size_t sz = vec.size();
    // With fewer than two values the sample variance is undefined; return 0
    if (sz <= 1)
        return T(0);
    // Calculate the mean
    const T mean = std::accumulate(vec.begin(), vec.end(), 0.0) / sz;
    // Now calculate the variance, dividing each squared deviation by (sz - 1)
    auto variance_func = [&mean, &sz](T accumulator, const T& val)
    {
        return accumulator + ((val - mean)*(val - mean) / (sz - 1));
    };
    return std::accumulate(vec.begin(), vec.end(), 0.0, variance_func);
}
A sample of how to use this function:
int main()
{
    std::vector<double> vec = {1.0, 5.0, 6.0, 3.0, 4.5};
    std::cout << variance(vec) << std::endl;
}
Answered by horseshoe
Your variance calculation is outside the loop, so it is based only on the single value at n == 100.
You need an additional loop.
You need:
var = 0;
n = 0;
while (n < numPoints) {
    var = var + ((Array[n] - mean) * (Array[n] - mean));
    n++;
}
var /= numPoints;
sd = sqrt(var);
Answered by D.Zadravec
Two simple functions to calculate standard deviation and variance in C++.
#include <cmath>
#include <cstddef>
#include <iostream>
#include <vector>
double StandardDeviation(const std::vector<double>& samples);
double Variance(const std::vector<double>& samples);
int main()
{
    std::vector<double> samples;
    samples.push_back(2.0);
    samples.push_back(3.0);
    samples.push_back(4.0);
    samples.push_back(5.0);
    samples.push_back(6.0);
    samples.push_back(7.0);
    double sd = StandardDeviation(samples);
    std::cout << "Standard deviation: " << sd << std::endl;
    return 0;
}
double StandardDeviation(const std::vector<double>& samples)
{
    return std::sqrt(Variance(samples));
}
// Sample variance (divides by size - 1), computed in a single pass
double Variance(const std::vector<double>& samples)
{
    const std::size_t size = samples.size();
    if (size < 2)
        return 0.0; // undefined for fewer than two samples; avoids dividing by zero
    double variance = 0;
    double t = samples[0]; // running sum of the samples seen so far
    for (std::size_t i = 1; i < size; i++)
    {
        t += samples[i];
        double diff = ((i + 1) * samples[i]) - t;
        variance += (diff * diff) / ((i + 1.0) * i);
    }
    return variance / (size - 1);
}
Answered by Caleth
Rather than writing out more loops, you can create a function object to pass to std::accumulate that sums the squared deviations from the mean.
#include <cmath>
template <typename T>
struct normalize {
    // Adds the squared deviation of value from the mean to the running total
    T operator()(T initial, T value) {
        return initial + std::pow(value - mean, 2);
    }
    T mean;
};
While we are at it, we can use std::istream_iterator to do the file loading, and std::vector because we don't know at compile time how many values there are. This gives us:
#include <fstream>
#include <functional>
#include <iostream>
#include <iterator>
#include <numeric>
#include <vector>
int main()
{
    std::vector<int> values; // empty for now; filled from the file below
    std::ifstream myfile("numbers.txt");
    if (myfile)
    {
        values.assign(std::istream_iterator<int>(myfile), {});
    }
    else { std::cout << "Error loading file" << std::endl; }
    float sum = std::accumulate(values.begin(), values.end(), 0, std::plus<int>()); // plus is the default for accumulate, can be omitted
    std::cout << "The sum of all integers: " << sum << std::endl;
    float mean = sum / values.size();
    std::cout << "The mean of all integers: " << mean << std::endl;
    // The initial value must be a float (0.0f); with an int 0 the running total would be truncated to int at every step
    float var = std::accumulate(values.begin(), values.end(), 0.0f, normalize<float>{ mean }) / values.size();
    float sd = std::sqrt(var);
    std::cout << "The standard deviation is: " << sd << std::endl;
    return 0;
}