Python ValueError: Data is not binary and pos_label is not specified

Note: this page is a translation of a popular StackOverFlow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverFlow URL: http://stackoverflow.com/questions/18401112/

Date: 2020-08-19 10:38:57  Source: igfitidea

python scikit-learn roc

Question

I am trying to calculate roc_auc_score, but I am getting the following error:

"ValueError: Data is not binary and pos_label is not specified"

My code snippet is as follows:

import numpy as np
from sklearn.metrics import roc_auc_score

y_scores = np.array([0.63, 0.53, 0.36, 0.02, 0.70, 1, 0.48, 0.46, 0.57])
y_true = np.array(['0', '1', '0', '0', '1', '1', '1', '1', '1'])
roc_auc_score(y_true, y_scores)

Please tell me what is wrong with it.

Accepted answer by jabaldonedo

You only need to change y_true so that it contains integers instead of strings:

y_true = np.array([0, 1, 0, 0, 1, 1, 1, 1, 1])
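
If the labels already exist as an array of strings, retyping them by hand is not the only option. The following is a minimal sketch, assuming every label is a string that parses as an integer; the name y_true_str is introduced here for illustration and is not part of the original answer.

```python
import numpy as np

# Labels as they appear in the question: strings, not numbers.
y_true_str = np.array(['0', '1', '0', '0', '1', '1', '1', '1', '1'])

# astype(int) parses each string as an integer and returns a new array.
y_true = y_true_str.astype(int)

print(y_true_str.dtype.kind)  # U (unicode strings)
print(y_true.dtype.kind)      # i (signed integers)
print(y_true)                 # [0 1 0 0 1 1 1 1 1]
```

The converted array holds the same values as the hand-written integer array above, so it can be passed to roc_auc_score in its place.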

Explanation: If you take a look at what the roc_auc_score function does in https://github.com/scikit-learn/scikit-learn/blob/0.15.X/sklearn/metrics/metrics.py, you will see that y_true is evaluated as follows:

classes = np.unique(y_true)
if (pos_label is None and not (np.all(classes == [0, 1]) or
                               np.all(classes == [-1, 1]) or
                               np.all(classes == [0]) or
                               np.all(classes == [-1]) or
                               np.all(classes == [1]))):
    raise ValueError("Data is not binary and pos_label is not specified")

At the time of execution pos_label is None, and because you are defining y_true as an array of strings, every np.all comparison against the integer label sets is False. Their combined result is then negated, so the if condition is True and the exception is raised.
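
The effect of the check can be reproduced without calling scikit-learn. The sketch below is a simplified stand-in, not the library's code: labels_look_binary and ACCEPTED_LABEL_SETS are hypothetical names, and np.array_equal is substituted for np.all(classes == ...) so that label sets of a different length simply compare unequal.

```python
import numpy as np

# Hypothetical constant: the label sets the quoted check accepts.
ACCEPTED_LABEL_SETS = ([0, 1], [-1, 1], [0], [-1], [1])


def labels_look_binary(y_true):
    """Return True if the unique labels match one of the accepted sets."""
    classes = np.unique(y_true)
    # np.array_equal is a substitution for np.all(classes == ...):
    # it returns False when shapes or values differ.
    return any(np.array_equal(classes, accepted)
               for accepted in ACCEPTED_LABEL_SETS)


string_labels = np.array(['0', '1', '0', '0', '1', '1', '1', '1', '1'])
int_labels = np.array([0, 1, 0, 0, 1, 1, 1, 1, 1])

print(labels_look_binary(string_labels))  # False: '0' and '1' are not 0 and 1
print(labels_look_binary(int_labels))     # True
```

With the string labels none of the accepted sets match, which corresponds to the branch that raises the ValueError when pos_label is None.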

Answer by mAge

The problem is in y_true=np.array(['0', '1', '0', '0', '1', '1', '1', '1', '1']). Convert the values of y_true to Boolean:

y_true = '1' <= y_true
print(y_true)  # [False  True False False  True  True  True  True  True]
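
Note that '1' <= y_true is a lexicographic string comparison, so labels such as '2' or '10' would also come out as True. If only the exact string '1' should count as positive, an equality test is a more explicit alternative. This is a sketch under that assumption; the name y_true_bool is introduced here for illustration.

```python
import numpy as np

y_true = np.array(['0', '1', '0', '0', '1', '1', '1', '1', '1'])

# Elementwise equality: True exactly where the label is the string '1'.
y_true_bool = (y_true == '1')

print(y_true_bool.dtype)     # bool
print(y_true_bool.tolist())  # [False, True, False, False, True, True, True, True, True]
```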