Python scikit-learn cross-validation: mean squared error is negative
Note: this page reproduces a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original question: http://stackoverflow.com/questions/21443865/
scikit-learn cross validation, negative values with mean squared error
Asked by ahmethungari
When I use the following code with a data matrix X of size (952, 144) and an output vector y of size (952), the mean_squared_error metric returns negative values, which is unexpected. Do you have any idea why?
from sklearn.svm import SVR
from sklearn import cross_validation as CV
reg = SVR(C=1., epsilon=0.1, kernel='rbf')
scores = CV.cross_val_score(reg, X, y, cv=10, scoring='mean_squared_error')
All values in scores are then negative.
Accepted answer by AN6U5
Trying to close this out, so am providing the answer that David and larsmans have eloquently described in the comments section:
Yes, this is supposed to happen. The actual MSE is simply the positive version of the number you're getting.
The unified scoring API always maximizes the score, so any metric that should be minimized, such as MSE, is negated to fit that convention. The returned score is therefore negative for metrics that should be minimized and left positive for metrics that should be maximized.
This is also described in sklearn GridSearchCV with Pipeline.
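The sign convention can be illustrated without scikit-learn. The sketch below is plain Python, not the library's implementation, and the names mse, neg_mse and pick_best are hypothetical helpers made up for this example. It shows why negating an error lets a single "take the maximum" rule pick the best candidate:

```python
# Illustration only: hypothetical helpers, not scikit-learn code.

def mse(y_true, y_pred):
    """Plain mean squared error: lower is better."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def neg_mse(y_true, y_pred):
    """Negated MSE: higher is better, so it fits a 'maximize' convention."""
    return -mse(y_true, y_pred)


def pick_best(candidates, y_true, scorer):
    """Return the name of the candidate whose score is highest."""
    return max(candidates, key=lambda name: scorer(y_true, candidates[name]))


y_true = [1.0, 2.0, 3.0]
candidates = {
    "close": [1.0, 2.0, 4.0],  # squared errors 0, 0, 1
    "far": [3.0, 4.0, 5.0],    # squared errors 4, 4, 4
}

# Both scores are negative; the one nearer to zero wins.
scores = {name: neg_mse(y_true, pred) for name, pred in candidates.items()}
best = pick_best(candidates, y_true, neg_mse)
print(scores, best)
```

With the plain mse function as the scorer, the same pick_best would select the worst candidate, which is the situation the negation avoids.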
Answer by Otacílio Neto
You can fix it by changing the scoring method to "neg_mean_squared_error", as you can see below:
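from sklearn.svm import SVR
# sklearn.cross_validation was removed in scikit-learn 0.20; use model_selection instead
from sklearn.model_selection import cross_val_score
reg = SVR(C=1., epsilon=0.1, kernel='rbf')
# The scores are still negative: each entry is -MSE for one fold
scores = cross_val_score(reg, X, y, cv=10, scoring='neg_mean_squared_error')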
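To read the result as an ordinary MSE, the returned scores can simply be negated. This is a minimal sketch, assuming a recent scikit-learn that provides sklearn.model_selection and using small synthetic data in place of the question's X and y (the shapes and the random seed are made up for illustration):

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVR

# Synthetic stand-in data (assumption); the question's X has shape (952, 144).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = X[:, 0] + 0.1 * rng.normal(size=100)

reg = SVR(C=1.0, epsilon=0.1, kernel="rbf")

# Each entry is the negated MSE of one fold, so every value is <= 0.
scores = cross_val_score(reg, X, y, cv=10, scoring="neg_mean_squared_error")

# Flip the sign to recover the usual, non-negative MSE per fold.
mse_per_fold = -scores
mean_mse = mse_per_fold.mean()
print(mse_per_fold, mean_mse)
```

Negating after the call leaves the scoring convention untouched, so the same scoring string can still be passed to tools such as GridSearchCV, which choose the parameters with the highest score.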

