Python/Scikit-Learn - Can't handle mix of multiclass and continuous

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/37367405/

Time: 2020-08-19 19:18:17  Source: igfitidea

python scikit-learn

Asked by lte__

I'm trying to fit an SGDRegressor to my data and then check the accuracy. The fitting works fine, but then the predictions are not in the same datatype(?) as the original target data, and I get the error

ValueError: Can't handle mix of multiclass and continuous

When calling print "Accuracy:", ms.accuracy_score(y_test,predictions).

The data looks like this (just with 200,000+ rows):

   Product_id  Date        product_group1  Price  Net price  Purchase price  Hour  Quantity  product_group2
0  107         12/31/2012  10              300    236        220             10    1         108

The code is as follows:

import numpy as np
from sklearn import metrics as ms
from sklearn.linear_model import SGDRegressor
from sklearn.preprocessing import StandardScaler

# beers is the DataFrame shown above; split it randomly, roughly 80% train / 20% test
msk = np.random.rand(len(beers)) < 0.8

train = beers[msk]
test = beers[~msk]

# Features and target (Quantity) for training
X = train[['Price', 'Net price', 'Purchase price', 'Hour', 'Product_id', 'product_group2']]
y = train[['Quantity']]
y = y.as_matrix().ravel()

# Same columns for the held-out test rows
X_test = test[['Price', 'Net price', 'Purchase price', 'Hour', 'Product_id', 'product_group2']]
y_test = test[['Quantity']]
y_test = y_test.as_matrix().ravel()

clf = SGDRegressor(n_iter=2000)
clf.fit(X, y)
predictions = clf.predict(X_test)
# This call raises the ValueError shown above
print "Accuracy:", ms.accuracy_score(y_test, predictions)

What should I do differently? Thank you!

Answered by BrenBarn

Accuracy is a classification metric. You can't use it with a regression. See the documentation for info on the various metrics.
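
To see where the "mix of multiclass and continuous" wording comes from, here is a minimal sketch using scikit-learn's `type_of_target` helper. The arrays `y_true` and `y_pred` are made-up stand-ins for the question's `y_test` and `predictions`, not the asker's real data, and the code is written for Python 3.

```python
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.utils.multiclass import type_of_target

# Hypothetical stand-ins for the question's y_test and predictions.
y_true = np.array([1, 2, 3, 1, 2])            # integer quantities
y_pred = np.array([1.2, 1.9, 2.7, 0.8, 2.1])  # float output of a regressor

# Integer targets with more than two distinct values look like class labels,
# while non-integer floats are treated as continuous values.
true_kind = type_of_target(y_true)
pred_kind = type_of_target(y_pred)
print(true_kind, pred_kind)

# accuracy_score refuses to compare the two kinds of target.
error_raised = False
try:
    accuracy_score(y_true, y_pred)
except ValueError as err:
    error_raised = True
    print("ValueError:", err)
```

The exact text of the error message may differ between scikit-learn versions, but the cause is the same: the true values look like class labels and the predictions are real numbers.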

Answered by Juan Jose Polanco Arias

Accuracy score is only for classification problems. For regression problems you can use: R2 Score, MSE (Mean Squared Error), RMSE (Root Mean Squared Error).
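
As a rough illustration of the metrics named above, the sketch below computes them with `sklearn.metrics` on a small made-up example. The arrays are hypothetical and only stand in for the question's `y_test` and `predictions`. RMSE is taken here as the square root of the MSE, and the code is written for Python 3.

```python
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score

# Hypothetical true values and regressor predictions.
y_true = np.array([3.0, 5.0, 2.0, 7.0])
y_pred = np.array([2.5, 5.0, 3.0, 8.0])

mse = mean_squared_error(y_true, y_pred)  # mean of the squared errors
rmse = np.sqrt(mse)                       # same units as the target
r2 = r2_score(y_true, y_pred)             # 1.0 would be a perfect fit

print("MSE:", mse)
print("RMSE:", rmse)
print("R2:", r2)
```

In the question's code, the same calls would take `y_test` and `predictions` in place of the sample arrays.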
