C++: getting the slope of a linear regression line
Original source: http://stackoverflow.com/questions/18939869/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
How to get the slope of a linear regression line using C++?
Asked by godzilla
I need to obtain the slope of a linear regression, similar to how the Excel function at the link below is implemented:
http://office.microsoft.com/en-gb/excel-help/slope-function-HP010342903.aspx
Is there a library in C++ or a simple coded solution someone has created which can do this?
I have implemented code according to the formula below (taken from http://easycalculation.com/statistics/learn-regression.php); however, it does not always give me the correct results:
Slope(b) = (NΣXY - (ΣX)(ΣY)) / (NΣX^2 - (ΣX)^2)
= ((5)*(1159.7) - (311)*(18.6)) / ((5)*(19359) - (311)^2)
= (5798.5 - 5784.6) / (96795 - 96721)
= 13.9 / 74
= 0.19
If I try it against the following vectors, I get the wrong result (I expect 0.305556):
x = 6, 5, 11, 7, 5, 4, 4
y = 2, 3, 9, 1, 8, 7, 5
Thanks in advance.
Answered by Cassio Neri
Here is a C++11 implementation:
#include <algorithm>
#include <iostream>
#include <numeric>
#include <vector>

double slope(const std::vector<double>& x, const std::vector<double>& y) {
    const auto n    = x.size();
    const auto s_x  = std::accumulate(x.begin(), x.end(), 0.0);
    const auto s_y  = std::accumulate(y.begin(), y.end(), 0.0);
    const auto s_xx = std::inner_product(x.begin(), x.end(), x.begin(), 0.0);
    const auto s_xy = std::inner_product(x.begin(), x.end(), y.begin(), 0.0);
    const auto a    = (n * s_xy - s_x * s_y) / (n * s_xx - s_x * s_x);
    return a;
}

int main() {
    std::vector<double> x{6, 5, 11, 7, 5, 4, 4};
    std::vector<double> y{2, 3, 9, 1, 8, 7, 5};
    std::cout << slope(x, y) << '\n'; // outputs 0.305556
}
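As a sanity check, the sums formula quoted in the question can be worked by hand for the two vectors used in main. This is only arithmetic on that data, added here for illustration; it is not part of the original answer.

```latex
\begin{align*}
n &= 7, \quad \sum x = 42, \quad \sum y = 35, \quad \sum xy = 221, \quad \sum x^2 = 288 \\
b &= \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - \left(\sum x\right)^2}
   = \frac{7 \cdot 221 - 42 \cdot 35}{7 \cdot 288 - 42^2}
   = \frac{1547 - 1470}{2016 - 1764}
   = \frac{77}{252} \approx 0.305556
\end{align*}
```

The formula itself therefore reproduces the expected value, which suggests that a mismatch is more likely to come from the implementation than from the formula.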
You can add a test for the mathematical requirements (x.size() == y.size() and x is not constant) or, as in the code above, assume that the user will take care of that.
Answered by Qué Padre
Why not just write some simple code like this (certainly not the best solution, just an example based on the help article):
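#include <cstddef>
#include <numeric>
#include <stdexcept>
#include <vector>

double slope(const std::vector<double>& x, const std::vector<double>& y) {
    if (x.size() != y.size()) {
        // std::exception has no portable string constructor, so use a standard subclass
        throw std::invalid_argument("...");
    }
    const std::size_t n = x.size();
    const double avgX = std::accumulate(x.begin(), x.end(), 0.0) / n;
    const double avgY = std::accumulate(y.begin(), y.end(), 0.0) / n;
    double numerator = 0.0;
    double denominator = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        numerator += (x[i] - avgX) * (y[i] - avgY);
        denominator += (x[i] - avgX) * (x[i] - avgX);
    }
    if (denominator == 0.0) {  // x is constant (or empty): the slope is undefined
        throw std::invalid_argument("...");
    }
    return numerator / denominator;
}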
Note that the third argument of the accumulate function must be 0.0 rather than 0; otherwise the compiler will deduce its type as int, and there is a good chance that the result of the accumulate calls will be wrong (it is actually wrong with MSVC2010 and mingw-w64 when 0 is passed as the third parameter).
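The effect of the initial value's type can be seen in a small sketch. The two function names below are hypothetical, chosen only for this illustration.

```cpp
#include <numeric>
#include <vector>

// Initial value 0 is an int, so the accumulator is an int and every
// partial sum is truncated back to an integer.
int sum_with_int_init(const std::vector<double>& v) {
    return std::accumulate(v.begin(), v.end(), 0);
}

// Initial value 0.0 is a double, so the partial sums keep their fractions.
double sum_with_double_init(const std::vector<double>& v) {
    return std::accumulate(v.begin(), v.end(), 0.0);
}
```

For the input {0.5, 0.5, 0.5}, the first function returns 0 because each partial sum 0 + 0.5 is converted back to the int 0, while the second returns 1.5.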
Answered by The Quantum Physicist
The following is a templated function I use for linear regression (fitting). It takes a std::vector of data values:
#include <cstddef>
#include <vector>

template <typename T>
std::vector<T> GetLinearFit(const std::vector<T>& data)
{
    // The x values are implicitly the indices 0, 1, ..., data.size() - 1.
    const T n = static_cast<T>(data.size());
    T xSum = 0, ySum = 0, xxSum = 0, xySum = 0;
    for (std::size_t i = 0; i < data.size(); i++)
    {
        const T x = static_cast<T>(i);
        xSum += x;
        ySum += data[i];
        xxSum += x * x;
        xySum += x * data[i];
    }
    const T slope = (n * xySum - xSum * ySum) / (n * xxSum - xSum * xSum);
    const T intercept = (ySum - slope * xSum) / n;
    std::vector<T> res;
    res.push_back(slope);
    res.push_back(intercept);
    return res;
}
The function returns a vector with the first element being the slope, and the second element being the intercept of your linear regression.
Example usage:
std::vector<double> myData;
myData.push_back(1);
myData.push_back(3);
myData.push_back(4);
myData.push_back(2);
myData.push_back(5);
std::vector<double> linearReg = GetLinearFit(myData);
double slope = linearReg[0];
double intercept = linearReg[1];
Notice that the function presumes you have a series of numbers for your x-axis (which is what I needed). You may change that in the function if you wish.
Answered by 3000farad
I had to create a similar function, but I needed it to handle a bunch of near-vertical slopes. I started out with Cassio Neri's code and then modified it to recalculate slopes that appear to be steeper than 1 after mirroring each point around the line x=y (which can be done easily by switching the x and y values). Then it will mirror it back and return a more accurate slope.
#include <algorithm>
#include <cmath>     // std::abs for double
#include <iostream>
#include <numeric>
#include <vector>

double slope(const std::vector<double>& x, const std::vector<double>& y) {
    const double n    = x.size();
    const double s_x  = std::accumulate(x.begin(), x.end(), 0.0);
    const double s_y  = std::accumulate(y.begin(), y.end(), 0.0);
    const double s_xx = std::inner_product(x.begin(), x.end(), x.begin(), 0.0);
    const double s_xy = std::inner_product(x.begin(), x.end(), y.begin(), 0.0);
    const double numer = n * s_xy - s_x * s_y; // The same regardless of inversion (both terms here are commutative)
    const double denom = n * s_xx - s_x * s_x; // Will change if inverted; use this for now
    double a;
    if (denom == 0) a = 2; // If slope is vertical, force variable inversion calculation
    else a = numer / denom;
    if (std::abs(a) > 1) { // Redo with variable inversion if slope is steeper than 1
        const double s_yy = std::inner_product(y.begin(), y.end(), y.begin(), 0.0);
        const double new_denom = n * s_yy - s_y * s_y;
        a = new_denom / numer; // Invert the fraction because we've mirrored it around x=y
    }
    return a;
}

int main() {
    std::vector<double> x{6, 5, 11, 7, 5, 4, 4};
    std::vector<double> y{2, 3, 9, 1, 8, 7, 5};
    std::cout << slope(x, y) << '\n';
}
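The inversion step in the code above can be written out explicitly. The symbols b_{y|x}, b_{x|y} and r (the correlation coefficient) are notation introduced here for illustration and do not appear in the original answer.

```latex
\begin{align*}
b_{y|x} &= \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - \left(\sum x\right)^2}
  && \text{slope of $y$ regressed on $x$ (\texttt{numer / denom})} \\
b_{x|y} &= \frac{n\sum xy - \sum x \sum y}{n\sum y^2 - \left(\sum y\right)^2}
  && \text{slope after swapping $x$ and $y$} \\
\frac{1}{b_{x|y}} &= \frac{n\sum y^2 - \left(\sum y\right)^2}{n\sum xy - \sum x \sum y}
  && \text{mirrored back (\texttt{new\_denom / numer})} \\
b_{y|x}\, b_{x|y} &= r^2
  \quad\Longrightarrow\quad \frac{1}{b_{x|y}} = \frac{b_{y|x}}{r^2}
\end{align*}
```

Because r^2 is at most 1, the mirrored-back value matches the ordinary slope only when the points are perfectly collinear and is steeper otherwise, so the two branches of the function do not compute quite the same quantity. For the sample data in main the first branch is taken, since the slope 0.305556 is not steeper than 1.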