Python: "continuous is not supported" error in RandomForestRegressor

Notice: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/32664717/

Time: 2020-08-19 12:01:17  Source: igfitidea

Got continuous is not supported error in RandomForestRegressor

Tags: python, pandas, dataframe, scikit-learn, random-forest

Asked by toy

I'm trying to run a simple RandomForestRegressor example, but when I test the accuracy I get this error:

/Users/noppanit/anaconda/lib/python2.7/site-packages/sklearn/metrics/classification.pyc in accuracy_score(y_true, y_pred, normalize, sample_weight)
    177
    178     # Compute accuracy for each possible representation
--> 179     y_type, y_true, y_pred = _check_targets(y_true, y_pred)
    180     if y_type.startswith('multilabel'):
    181         differing_labels = count_nonzero(y_true - y_pred, axis=1)

/Users/noppanit/anaconda/lib/python2.7/site-packages/sklearn/metrics/classification.pyc in _check_targets(y_true, y_pred)
     90     if (y_type not in ["binary", "multiclass", "multilabel-indicator",
     91                        "multilabel-sequences"]):
---> 92         raise ValueError("{0} is not supported".format(y_type))
     93
     94     if y_type in ["binary", "multiclass"]:

ValueError: continuous is not supported

This is a sample of the data layout. I can't show the real data.

target, func_1, func_2, func_3, ... func_200
float, float, float, float, ... float

Here's my code.

import pandas as pd
import numpy as np
from sklearn.preprocessing import Imputer
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor, ExtraTreesRegressor, GradientBoostingRegressor
from sklearn.cross_validation import train_test_split
from sklearn.metrics import accuracy_score
from sklearn import tree

train = pd.read_csv('data.txt', sep='\t')

labels = train.target
train.drop('target', axis=1, inplace=True)
cat = ['cat']
train_cat = pd.get_dummies(train[cat])

train.drop(train[cat], axis=1, inplace=True)
train = np.hstack((train, train_cat))

imp = Imputer(missing_values='NaN', strategy='mean', axis=0)
imp.fit(train)
train = imp.transform(train)

x_train, x_test, y_train, y_test = train_test_split(train, labels.values, test_size = 0.2)

clf = RandomForestRegressor(n_estimators=10)

clf.fit(x_train, y_train)
y_pred = clf.predict(x_test)
accuracy_score(y_test, y_pred) # This is where I get the error.
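
A note for anyone running this on a current scikit-learn: sklearn.cross_validation and sklearn.preprocessing.Imputer, both imported above, have since been removed from the library. Below is a minimal sketch of the same preprocessing steps using sklearn.model_selection and sklearn.impute.SimpleImputer. The small DataFrame and its values are made up for illustration (the real data is not shown), and passing dtype=float to get_dummies is my own choice to keep every column numeric.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split

# Hypothetical miniature of the table described above:
# a float target, float features, and one categorical column named 'cat'.
train = pd.DataFrame({
    "target": [1.0, 2.5, 0.3, 4.1, 3.3],
    "func_1": [0.1, np.nan, 0.4, 0.9, 0.5],
    "func_2": [1.2, 0.7, np.nan, 0.3, 0.8],
    "cat": ["a", "b", "a", "c", "b"],
})

# Separate the target, then one-hot encode the categorical column.
labels = train.pop("target")
train = pd.get_dummies(train, columns=["cat"], dtype=float)

# SimpleImputer replaces the removed Imputer; missing values are np.nan.
imp = SimpleImputer(missing_values=np.nan, strategy="mean")
features = imp.fit_transform(train)

# train_test_split now lives in sklearn.model_selection.
x_train, x_test, y_train, y_test = train_test_split(
    features, labels.values, test_size=0.2, random_state=0
)
```

This only modernizes the imports and preprocessing; the accuracy_score call would still fail for the reason given in the answers.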

Accepted answer by Ibraim Ganiev

It's because accuracy_score is for classification tasks only. For regression you should use a different metric, for example:

clf.score(X_test, y_test)

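Here X_test holds the samples (x_test in the question's code) and y_test holds the corresponding ground-truth values. The score method computes the predictions internally and, for a regressor, returns the coefficient of determination (R-squared).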
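
To make the difference concrete, here is a minimal sketch on synthetic data (the data, the random seeds and the name reg are assumptions for illustration, not the asker's setup). It shows accuracy_score rejecting continuous targets, and regression metrics from sklearn.metrics working on the same predictions.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import accuracy_score, mean_absolute_error, mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the real data: 200 rows, 5 float features, float target.
rng = np.random.RandomState(0)
X = rng.rand(200, 5)
y = 3.0 * X[:, 0] + rng.normal(scale=0.1, size=200)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

reg = RandomForestRegressor(n_estimators=10, random_state=0)
reg.fit(X_train, y_train)
y_pred = reg.predict(X_test)

# accuracy_score is a classification metric and raises on continuous targets.
try:
    accuracy_score(y_test, y_pred)
    accuracy_error = None
except ValueError as exc:
    accuracy_error = str(exc)
print("accuracy_score raised:", accuracy_error)

# Regression metrics accept continuous targets.
mse = mean_squared_error(y_test, y_pred)
mae = mean_absolute_error(y_test, y_pred)
r2 = reg.score(X_test, y_test)
print("MSE:", mse, "MAE:", mae, "R^2:", r2)
```

Which metric to report depends on the problem: MSE and MAE are in the units of the target (squared for MSE), while R-squared is unitless and at most 1.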

Answer by ThReSholD

Since you are doing a regression task, you should use the R-squared metric (coefficient of determination) instead of the accuracy score, which is meant for classification.

To avoid confusion, I suggest using a different variable name, such as reg or rfr, rather than clf, which suggests a classifier.

R-squared can be computed by calling the score function provided by RandomForestRegressor, for example:

rfr.score(X_test, y_test)
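
As a rough sketch of what that call computes (synthetic data and an illustrative train/test slice, both assumptions of mine), the value returned by score should match sklearn.metrics.r2_score applied to the predictions, and also the textbook formula 1 - SS_res / SS_tot:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score

# Synthetic data: 150 rows, 4 float features, float target.
rng = np.random.RandomState(42)
X = rng.rand(150, 4)
y = X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.05, size=150)

# Simple hold-out split: first 120 rows for training, the rest for testing.
X_train, X_test = X[:120], X[120:]
y_train, y_test = y[:120], y[120:]

rfr = RandomForestRegressor(n_estimators=10, random_state=42)
rfr.fit(X_train, y_train)
y_pred = rfr.predict(X_test)

# R-squared three ways: the estimator's score method, the metric function,
# and the definition written out by hand.
score_r2 = rfr.score(X_test, y_test)
metric_r2 = r2_score(y_test, y_pred)
ss_res = np.sum((y_test - y_pred) ** 2)         # residual sum of squares
ss_tot = np.sum((y_test - y_test.mean()) ** 2)  # total sum of squares
manual_r2 = 1.0 - ss_res / ss_tot

print(score_r2, metric_r2, manual_r2)
```

Note the argument order for r2_score: ground truth first, predictions second.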