Python: How is the R2 value in scikit-learn calculated?

Question

Asked by joeally

The R^2 value returned by scikit-learn (metrics.r2_score()) can be negative. The docs say:

"Unlike most other scores, R2 score may be negative (it need not actually be the square of a quantity R)."

However, the Wikipedia article on R^2 mentions no R (not squared) quantity. Perhaps it uses absolute differences instead of squared differences. I really have no idea.

Answer 1

Accepted answer by eickenberg

The R^2 in scikit-learn is essentially the same as what is described in the Wikipedia article on the coefficient of determination (grep for "the most general definition"). It is 1 - residual sum of squares / total sum of squares.

The big difference between a classical stats setting and what you usually try to do with machine learning is that in machine learning you evaluate your score on unseen data, which can lead to results outside [0, 1]. If you apply R^2 to the same data you used to fit your model (for example, a least-squares linear fit with an intercept), it will lie within [0, 1].
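
As a rough illustration of the point above, here is a minimal sketch in plain Python. The helper names r2 and fit_line and all the data values are made up for this example; this is not scikit-learn's implementation, only the 1 - RSS/TSS formula applied by hand. A line is fitted by least squares on training data, then scored on that same data and on unseen data that follows a different trend.

```python
def r2(y_true, y_pred):
    """R^2 = 1 - RSS / TSS, with TSS taken around the mean of y_true."""
    mean_y = sum(y_true) / len(y_true)
    rss = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    tss = sum((y - mean_y) ** 2 for y in y_true)
    return 1 - rss / tss


def fit_line(x, y):
    """Closed-form ordinary least squares for y = a * x + b."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


# Hypothetical training data with a roughly linear upward trend.
x_train = [0, 1, 2, 3, 4]
y_train = [0.1, 0.9, 2.2, 2.8, 4.1]
a, b = fit_line(x_train, y_train)

# Hypothetical unseen data that follows a different (downward) trend.
x_test = [5, 6, 7, 8]
y_test = [3.0, 2.0, 1.0, 0.0]

r2_train = r2(y_train, [a * x + b for x in x_train])
r2_test = r2(y_test, [a * x + b for x in x_test])

print("R^2 on the data used for fitting:", r2_train)  # between 0 and 1
print("R^2 on unseen data:", r2_test)                 # negative here
```

On the training data the fitted line can never do worse than the mean, so the score stays in [0, 1]. On the unseen data nothing guarantees that, and in this constructed case the score is far below zero.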

Answer by ManiS

Since R^2 = 1 - RSS/TSS, R^2 is negative only when RSS/TSS > 1, which happens when our model is even worse than the assumed worst model (the baseline that always predicts the mean).

Here RSS = sum of squared differences between the actual values (yi) and the predicted values (yi^), and TSS = sum of squared differences between the actual values (yi) and their mean value (before applying regression). So you can think of TSS as the error of the mean baseline, and of RSS as the error of our model. A perfect model has RSS = 0, and a model that falls between the perfect model and the mean baseline has RSS/TSS < 1. If our model is even worse than the mean baseline, then RSS > TSS, since its predictions are, in squared terms, farther from the actual observations than the mean value is.
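
The cases described above can be checked by hand with a small plain-Python sketch. The function name rss_tss, the case labels and the numbers are invented for illustration and are not taken from the original answer.

```python
def rss_tss(y_true, y_pred):
    """Return (RSS, TSS) for the given actual and predicted values."""
    mean_y = sum(y_true) / len(y_true)
    rss = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    tss = sum((y - mean_y) ** 2 for y in y_true)
    return rss, tss


# Hypothetical actual values; their mean is 2.5 and TSS is 5.0.
y = [1.0, 2.0, 3.0, 4.0]

# Hypothetical predictions, ordered from best to worst.
cases = {
    "perfect model": [1.0, 2.0, 3.0, 4.0],
    "in-between model": [1.5, 2.5, 2.5, 3.5],
    "mean baseline": [2.5, 2.5, 2.5, 2.5],
    "worse than the mean": [4.0, 3.0, 2.0, 1.0],
}

r2_by_case = {}
for name, y_pred in cases.items():
    rss, tss = rss_tss(y, y_pred)
    r2_by_case[name] = 1 - rss / tss
    print(f"{name}: RSS={rss}, TSS={tss}, R^2={r2_by_case[name]}")
```

With these numbers the perfect model scores 1, the mean baseline scores 0, and the reversed predictions have RSS = 20 against TSS = 5, which gives R^2 = -3.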

Check here for better intuition with visual representation: https://ragrawal.wordpress.com/2017/05/06/intuition-behind-r2-and-other-regression-evaluation-metrics/
