pandas ValueError: multiclass-multioutput format is not supported when using sklearn roc_auc_score

Question

Asked by stone rock

I am using logistic regression for prediction. My predictions are 0's and 1's. I trained my model on the given data and also on the important features only, i.e. X_important_train (see screenshot). I get a score of around 70%, but when I call roc_auc_score(X, y) or roc_auc_score(X_important_train, y_train) I get a value error: ValueError: multiclass-multioutput format is not supported

Code:

# Load libraries
from sklearn.linear_model import LogisticRegression
from sklearn import datasets
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import roc_auc_score

# Standardize features
scaler = StandardScaler()
X_std = scaler.fit_transform(X)

# Train the model using the training sets and check score
model = LogisticRegression()
model.fit(X, y)
model.score(X, y)

model.fit(X_important_train, y_train)
model.score(X_important_train, y_train)

# This call raises ValueError: multiclass-multioutput format is not supported
roc_auc_score(X_important_train, y_train)

Screenshot:

Answer 1

Accepted answer by seralouk

First of all, the roc_auc_score function expects input arguments with the same shape.

sklearn.metrics.roc_auc_score(y_true, y_score, average='macro', sample_weight=None)

Note: this implementation is restricted to the binary classification task or multilabel classification task in label indicator format.

y_true : array, shape = [n_samples] or [n_samples, n_classes]
True binary labels in binary label indicators.

y_score : array, shape = [n_samples] or [n_samples, n_classes]
Target scores, can either be probability estimates of the positive class, confidence values, or non-thresholded measure of decisions (as returned by “decision_function” on some classifiers).

Now, the inputs are the true labels and the predicted scores, NOT the training data and labels that you pass in the example you posted. In more detail,

model.fit(X_important_train, y_train)
model.score(X_important_train, y_train)
# This is wrong: the first argument must be the true labels, not the feature matrix
roc_auc_score(X_important_train, y_train)
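
To see where the error message comes from, here is a minimal sketch. It assumes the feature matrix holds integer values with more than two distinct levels, and the arrays X_features and y_labels are made-up stand-ins for X_important_train and y_train. Because the feature matrix is passed in the y_true position, scikit-learn inspects it as if it were a label array:

```python
import numpy as np
from sklearn.metrics import roc_auc_score
from sklearn.utils.multiclass import type_of_target

# Hypothetical stand-ins for X_important_train and y_train
X_features = np.array([[1, 0, 2], [3, 1, 0], [2, 2, 1], [0, 3, 3]])
y_labels = np.array([0, 1, 0, 1])

# How scikit-learn classifies each array when it is treated as a target
features_kind = type_of_target(X_features)
labels_kind = type_of_target(y_labels)
print(features_kind, labels_kind)

# Passing the features as y_true reproduces the failure
try:
    roc_auc_score(X_features, y_labels)
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)
```

The 2-D feature matrix is classified as a multi-column, multi-valued target, which roc_auc_score cannot handle, while the real label vector is a plain binary target.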

You should do something like:

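# Predict on the evaluation data, then compare with its true labels
y_pred = model.predict(X_test_data)
# y_true holds the true labels for X_test_data
roc_auc_score(y_true, y_pred)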
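
For a fuller picture, the following is a sketch on synthetic data, not the asker's dataset. The make_classification data, the train/test split, and all variable names are assumptions for illustration. It computes the AUC from hard 0/1 predictions, as in the snippet above, and from positive-class probabilities, which match the "probability estimates of the positive class" described for y_score in the quoted documentation:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic binary data standing in for the real dataset (an assumption)
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Hard 0/1 predictions: valid input, but the ranking information is lost
y_pred = model.predict(X_test)
auc_from_labels = roc_auc_score(y_test, y_pred)

# Probability of the positive class (column 1 of predict_proba)
y_score = model.predict_proba(X_test)[:, 1]
auc_from_scores = roc_auc_score(y_test, y_score)

print(f"AUC from hard labels:   {auc_from_labels:.3f}")
print(f"AUC from probabilities: {auc_from_scores:.3f}")
```

In both calls the first argument is the 1-D vector of true labels and the second is a 1-D vector of the same length, which is the shape agreement the answer refers to.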
