Python: Scikit-learn cross-validation scoring for regression

Question

Asked by clwen

How can one use cross_val_score for regression? The default scoring seems to be accuracy, which is not very meaningful for regression. I would like to use mean squared error; is it possible to specify that in cross_val_score?

I tried the following two calls, but neither works:

scores = cross_validation.cross_val_score(svr, diabetes.data, diabetes.target, cv=5, scoring='mean_squared_error')

and

scores = cross_validation.cross_val_score(svr, diabetes.data, diabetes.target, cv=5, scoring=metrics.mean_squared_error)

The first one generates a list of negative numbers while mean squared error should always be non-negative. The second one complains that:

mean_squared_error() takes exactly 2 arguments (3 given)

Answer 1

Accepted answer by Sirrah

I don't have the reputation to comment, but I want to provide this link for you and/or any passersby; it discusses the negative MSE output in scikit-learn: https://github.com/scikit-learn/scikit-learn/issues/2439

In addition (to make this a real answer), your first option is correct: MSE is the metric you want for comparing models, and R^2 cannot always be calculated, depending (I think) on the type of cross-validation you are using.

If you choose MSE as the scorer, cross_val_score outputs a list of per-fold errors (negated, as noted in the question), which you can then take the mean of, like so:

# Linear regression with leave-one-out cross-validation.
# Updated for scikit-learn 0.18 and later: sklearn.cross_validation became
# sklearn.model_selection, and the scorer 'mean_squared_error' was
# renamed 'neg_mean_squared_error'.

from sklearn import linear_model
from sklearn.model_selection import LeaveOneOut, cross_val_score
import numpy as np

# x and y stand for your own features and targets (not defined here).
# Including this to remind you to use numpy arrays rather than lists,
# otherwise you may get an error
X_digits = np.array(x)
Y_digits = np.array(y)

loo = LeaveOneOut()

regr = linear_model.LinearRegression()

scores = cross_val_score(regr, X_digits, Y_digits, scoring='neg_mean_squared_error', cv=loo)

# This will print the mean of the per-fold scores (negated MSE values)
# and provide your metric for evaluation
print(scores.mean())
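
For readers on scikit-learn 0.18 or later, here is a minimal sketch of the question's own setup (an SVR on the diabetes data with 5 folds). It assumes the newer `sklearn.model_selection` module and the `'neg_mean_squared_error'` scorer name, neither of which appears in the original thread, and it uses a default `SVR()` because the question does not show how `svr` was built. It is meant as an illustration, not as the only way to do this.

```python
from sklearn.datasets import load_diabetes
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVR

# Same data as in the question; SVR() with default settings is an assumption.
X, y = load_diabetes(return_X_y=True)
svr = SVR()

# Each entry is the negated MSE of one fold, so every value is <= 0.
neg_mse = cross_val_score(svr, X, y, cv=5, scoring="neg_mean_squared_error")

# Flip the sign to recover the usual non-negative MSE per fold.
mse_per_fold = -neg_mse
print(mse_per_fold.mean())
```

Negating the array once, right after the call, keeps the rest of the analysis in ordinary non-negative MSE units.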

Answer 2

Answered by Andreas Mueller

The first one is correct. It outputs the negative of the MSE because scikit-learn always tries to maximize the score. Please help us by suggesting an improvement to the documentation.
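
The question's second attempt fails because a callable passed as `scoring` is invoked as `scorer(estimator, X, y)`, which is three arguments, whereas `mean_squared_error` expects `(y_true, y_pred)`. A rough sketch of one way around that, assuming a recent scikit-learn, is to wrap the metric with `make_scorer`. The `LinearRegression` model and the variable names here are illustrative choices, not taken from the thread.

```python
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression
from sklearn.metrics import make_scorer, mean_squared_error
from sklearn.model_selection import cross_val_score

X, y = load_diabetes(return_X_y=True)
regr = LinearRegression()

# greater_is_better=False marks the metric as a loss, so the scorer returns
# -MSE and "higher is better" still holds when scores are compared.
mse_scorer = make_scorer(mean_squared_error, greater_is_better=False)

custom = cross_val_score(regr, X, y, cv=5, scoring=mse_scorer)
builtin = cross_val_score(regr, X, y, cv=5, scoring="neg_mean_squared_error")

# Both arrays hold negated per-fold MSE values.
print(custom)
print(builtin)
```

The wrapped scorer and the built-in string should give the same negated values, which is the sign convention the answer above describes.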
