Python: How to compute the Receiver Operating Characteristic (ROC) and AUC in Keras?

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/41032551/

Date: 2020-08-20 00:22:38  Source: igfitidea

How to compute Receiver Operating Characteristic (ROC) and AUC in keras?

python theano keras

Asked by Eka

I have a multi-output (200) binary classification model that I wrote in Keras.

I want to add metrics such as ROC and AUC to this model, but to my knowledge Keras doesn't have built-in ROC and AUC metric functions.

I tried to import the ROC and AUC functions from scikit-learn:

from sklearn.metrics import roc_curve, auc
from keras.models import Sequential
from keras.layers import Dense
.
.
.
model.add(Dense(200, activation='relu'))
model.add(Dense(300, activation='relu'))
model.add(Dense(400, activation='relu'))
model.add(Dense(300, activation='relu'))
model.add(Dense(200,init='normal', activation='softmax')) #outputlayer

model.compile(loss='categorical_crossentropy', optimizer='adam',metrics=['accuracy','roc_curve','auc'])

But it gives this error:

Exception: Invalid metric: roc_curve

How should I add ROC and AUC to Keras?

Answered by Tom

Because ROC and AUC cannot be computed meaningfully over individual mini-batches, you can only compute them at the end of an epoch. There is a solution from jamartinh; I have pasted the code below for convenience:

from sklearn.metrics import roc_auc_score
from keras.callbacks import Callback


class RocCallback(Callback):
    """Print ROC AUC on the training and validation data after every epoch."""

    def __init__(self, training_data, validation_data):
        super().__init__()
        self.x = training_data[0]
        self.y = training_data[1]
        self.x_val = validation_data[0]
        self.y_val = validation_data[1]

    def on_epoch_end(self, epoch, logs=None):
        # predict() is used instead of predict_proba(), which was removed
        # from newer Keras releases; for this model both return probabilities.
        y_pred_train = self.model.predict(self.x)
        roc_train = roc_auc_score(self.y, y_pred_train)
        y_pred_val = self.model.predict(self.x_val)
        roc_val = roc_auc_score(self.y_val, y_pred_val)
        print('\rroc-auc_train: %s - roc-auc_val: %s' % (str(round(roc_train,4)),str(round(roc_val,4))),end=100*' '+'\n')


roc = RocCallback(training_data=(X_train, y_train),
                  validation_data=(X_test, y_test))

model.fit(X_train, y_train,
          validation_data=(X_test, y_test),
          callbacks=[roc])
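To illustrate why a per-batch average is not a substitute for the epoch-level value, here is a minimal pure-Python sketch. The helper `rank_auc` and the toy data are assumptions made for illustration and are not part of the answer above; the helper computes AUC as the fraction of positive/negative pairs that are ranked correctly, which is equivalent to the area under the ROC curve.

```python
def rank_auc(y_true, y_score):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a win. Hypothetical helper, for illustration only.
    """
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC is undefined when only one class is present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy labels and scores, split into two "mini-batches" of four samples each.
y_true = [0, 1, 0, 1, 0, 0, 1, 1]
y_score = [0.1, 0.2, 0.15, 0.3, 0.6, 0.7, 0.8, 0.9]

full_auc = rank_auc(y_true, y_score)
batch_aucs = [rank_auc(y_true[i:i + 4], y_score[i:i + 4]) for i in (0, 4)]
mean_batch_auc = sum(batch_aucs) / len(batch_aucs)

print("AUC over all samples:  ", full_auc)
print("mean of per-batch AUCs:", mean_batch_auc)
```

Each batch is ranked perfectly on its own, yet the negatives of the second batch outscore the positives of the first, so the two numbers differ. The helper also raises an error for a batch that contains a single class, which is another reason per-batch evaluation is fragile.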

A more hackable way, using tf.contrib.metrics.streaming_auc (TensorFlow 1.x only, since tf.contrib was removed in TensorFlow 2):

import numpy as np
import tensorflow as tf
from sklearn.metrics import roc_auc_score
from sklearn.datasets import make_classification
from keras.models import Sequential
from keras.layers import Dense
from keras.utils import np_utils
from keras.callbacks import Callback, EarlyStopping


# define roc_callback, inspired by https://github.com/keras-team/keras/issues/6050#issuecomment-329996505
def auc_roc(y_true, y_pred):
    # any tensorflow metric
    value, update_op = tf.contrib.metrics.streaming_auc(y_pred, y_true)

    # find all variables created for this metric
    metric_vars = [i for i in tf.local_variables() if 'auc_roc' in i.name.split('/')[1]]

    # Add metric variables to GLOBAL_VARIABLES collection.
    # They will be initialized for new session.
    for v in metric_vars:
        tf.add_to_collection(tf.GraphKeys.GLOBAL_VARIABLES, v)

    # force to update metric values
    with tf.control_dependencies([update_op]):
        value = tf.identity(value)
        return value

# generate a small dataset
N_all = 10000
N_tr = int(0.7 * N_all)
N_te = N_all - N_tr
X, y = make_classification(n_samples=N_all, n_features=20, n_classes=2)
y = np_utils.to_categorical(y, num_classes=2)

X_train, X_valid = X[:N_tr, :], X[N_tr:, :]
y_train, y_valid = y[:N_tr, :], y[N_tr:, :]

# model & train
model = Sequential()
model.add(Dense(2, activation="softmax", input_shape=(X.shape[1],)))

model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy', auc_roc])

my_callbacks = [EarlyStopping(monitor='auc_roc', patience=300, verbose=1, mode='max')]

model.fit(X, y,
          validation_split=0.3,
          shuffle=True,
          batch_size=32, epochs=5, verbose=1,
          callbacks=my_callbacks)

# # or use independent valid set
# model.fit(X_train, y_train,
#           validation_data=(X_valid, y_valid),
#           batch_size=32, epochs=5, verbose=1,
#           callbacks=my_callbacks)

Answered by Kimball Hill

Like you, I prefer using scikit-learn's built-in methods to evaluate AUROC. I find that the best and easiest way to do this in Keras is to create a custom metric. If TensorFlow is your backend, this can be implemented in very few lines of code:

import tensorflow as tf
from sklearn.metrics import roc_auc_score

def auroc(y_true, y_pred):
    return tf.py_func(roc_auc_score, (y_true, y_pred), tf.double)

# Build Model...

model.compile(loss='categorical_crossentropy', optimizer='adam',metrics=['accuracy', auroc])

Creating a custom Callback, as mentioned in other answers, will not work in your case since your model has multiple outputs, but this will. Additionally, this method allows the metric to be evaluated on both training and validation data, whereas a Keras callback does not have access to the training data and can thus only be used to evaluate performance on the validation data.

Answered by B. Kanani

The following solution worked for me:

import tensorflow as tf
from keras import backend as K

def auc(y_true, y_pred):
    auc = tf.metrics.auc(y_true, y_pred)[1]
    K.get_session().run(tf.local_variables_initializer())
    return auc

model.compile(loss="binary_crossentropy", optimizer='adam', metrics=[auc])

Answered by Eka

I solved my problem this way.

Suppose you have a test dataset, with x_test for the features and y_test for the corresponding targets.

First, predict the targets from the features using the trained model:

 y_pred = model.predict_proba(x_test)

Then import the roc_auc_score function from sklearn and simply pass the original targets and the predicted targets to it:

 roc_auc_score(y_test, y_pred)
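For a multi-output model such as the 200-output one in the question, y_test and y_pred are two-dimensional, with one column per output. To my understanding, roc_auc_score then scores each column separately and macro-averages the results by default. The sketch below spells that idea out in pure Python; `rank_auc`, `per_output_auc`, and the toy arrays are hypothetical names and data chosen for illustration, not scikit-learn's implementation.

```python
def rank_auc(y_true, y_score):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC is undefined when only one class is present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def per_output_auc(y_true_rows, y_score_rows):
    """Return one AUC per output column of row-major 2-D lists."""
    true_cols = zip(*y_true_rows)    # transpose rows into columns
    score_cols = zip(*y_score_rows)
    return [rank_auc(t, s) for t, s in zip(true_cols, score_cols)]


# Toy data: four samples, two binary outputs (stand-ins for the 200 outputs).
y_test = [[1, 0], [0, 1], [1, 1], [0, 0]]
y_pred = [[0.9, 0.2], [0.3, 0.4], [0.6, 0.7], [0.4, 0.5]]

aucs = per_output_auc(y_test, y_pred)
macro_auc = sum(aucs) / len(aucs)

print("AUC per output:", aucs)
print("macro-averaged AUC:", macro_auc)
```

Looking at the per-output values as well as the average can be useful, because a single poorly ranked output is easy to miss in the mean.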

Answered by sunil manikani

'roc_curve' and 'auc' are not standard metrics, so you can't pass them to the metrics argument like that; it is not allowed. You can pass something like 'fmeasure', which is a standard metric.

Review the available metrics here: https://keras.io/metrics/ You may also want to look at making your own custom metric: https://keras.io/metrics/#custom-metrics

Also have a look at the generate_results method mentioned in this blog post for ROC and AUC: https://vkolachalama.blogspot.in/2016/05/keras-implementation-of-mlp-neural.html

Answered by KarthikS

Adding to the above answers: I got the error "ValueError: bad input shape ...", so I specified the vector of probabilities as follows:

y_pred = model.predict_proba(x_test)[:,1]
auc = roc_auc_score(y_test, y_pred)
print(auc)

Answered by 0-_-0

You can monitor AUC during training by providing metrics in the following way:

import tensorflow as tf
from tensorflow import keras

METRICS = [
      keras.metrics.TruePositives(name='tp'),
      keras.metrics.FalsePositives(name='fp'),
      keras.metrics.TrueNegatives(name='tn'),
      keras.metrics.FalseNegatives(name='fn'),
      keras.metrics.BinaryAccuracy(name='accuracy'),
      keras.metrics.Precision(name='precision'),
      keras.metrics.Recall(name='recall'),
      keras.metrics.AUC(name='auc'),
]

def make_model(metrics=METRICS, output_bias=None):
  # train_features is assumed to be the training feature array, defined elsewhere.
  if output_bias is not None:
    output_bias = tf.keras.initializers.Constant(output_bias)
  model = keras.Sequential([
      keras.layers.Dense(
          16, activation='relu',
          input_shape=(train_features.shape[-1],)),
      keras.layers.Dropout(0.5),
      keras.layers.Dense(1, activation='sigmoid',
                         bias_initializer=output_bias),
  ])

  model.compile(
      optimizer=keras.optimizers.Adam(learning_rate=1e-3),
      loss=keras.losses.BinaryCrossentropy(),
      metrics=metrics)

  return model


For a more detailed tutorial, see:
https://www.tensorflow.org/tutorials/structured_data/imbalanced_data
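Unlike the scikit-learn based approaches, a thresholded metric can be accumulated across mini-batches. As far as its documentation describes, keras.metrics.AUC keeps confusion-matrix counts at a fixed set of thresholds (200 by default) and approximates the area from them. The class below is a hypothetical pure-Python sketch of that idea, not the Keras implementation: the name `StreamingAUC`, the evenly spaced thresholds, and the trapezoidal rule are all assumptions made for illustration.

```python
class StreamingAUC:
    """Approximate ROC AUC from per-threshold counts accumulated over batches."""

    def __init__(self, num_thresholds=200):
        # Evenly spaced thresholds covering [0, 1] (an assumption of this sketch).
        self.thresholds = [i / (num_thresholds - 1) for i in range(num_thresholds)]
        self.tp = [0] * num_thresholds  # true positives at each threshold
        self.fp = [0] * num_thresholds  # false positives at each threshold
        self.pos = 0                    # positives seen so far
        self.neg = 0                    # negatives seen so far

    def update(self, y_true, y_score):
        """Add one mini-batch of binary labels and scores in [0, 1]."""
        for t, s in zip(y_true, y_score):
            if t == 1:
                self.pos += 1
            else:
                self.neg += 1
            for k, thr in enumerate(self.thresholds):
                if s >= thr:  # predicted positive at this threshold
                    if t == 1:
                        self.tp[k] += 1
                    else:
                        self.fp[k] += 1

    def result(self):
        """Area under the (FPR, TPR) points via the trapezoidal rule."""
        if self.pos == 0 or self.neg == 0:
            raise ValueError("AUC is undefined when only one class is present")
        # Append the (0, 0) corner so the curve always reaches the origin.
        tpr = [tp / self.pos for tp in self.tp] + [0.0]
        fpr = [fp / self.neg for fp in self.fp] + [0.0]
        area = 0.0
        for k in range(len(fpr) - 1):
            area += (fpr[k] - fpr[k + 1]) * (tpr[k] + tpr[k + 1]) / 2.0
        return area


metric = StreamingAUC()
# Two mini-batches; only the counts are kept between calls.
metric.update([0, 1, 0, 1], [0.1, 0.2, 0.15, 0.3])
metric.update([0, 0, 1, 1], [0.6, 0.7, 0.8, 0.9])
streamed_auc = metric.result()
print("streamed AUC:", streamed_auc)
```

Because only counts are stored, the result does not depend on how the samples are split into batches. The price is a discretisation error whenever several distinct scores fall between two neighbouring thresholds.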