Python: how to get a classifier's confidence score for a prediction in sklearn?

Disclaimer: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original URL: http://stackoverflow.com/questions/31129592/

Time: 2020-08-19 09:31:34  Source: igfitidea

How to get a classifier's confidence score for a prediction in sklearn?

Tags: python, machine-learning, scikit-learn, probability, prediction

Asked by user3377126

I would like to get a confidence score for each prediction the classifier makes, showing how sure it is that the prediction is correct.

I want something like this:

How sure is the classifier about its prediction?

Class 1: 81% that this is class 1
Class 2: 10%
Class 3: 6%
Class 4: 3%

Samples of my code:

from time import time

from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC

# main, target and sub_main are my own data (not shown here)
features_train, features_test, labels_train, labels_test = train_test_split(main, target, test_size=0.4)

# Determine amount of time to train
t0 = time()
model = SVC()
#model = SVC(kernel='poly')
#model = GaussianNB()

model.fit(features_train, labels_train)

print('training time: ', round(time() - t0, 3), 's')

# Determine amount of time to predict on the held-out test split
t1 = time()
pred = model.predict(features_test)

print('predicting time: ', round(time() - t1, 3), 's')

accuracy = accuracy_score(labels_test, pred)

print('Confusion Matrix: ')
print(confusion_matrix(labels_test, pred))

# Accuracy in the 0.9333, 0.9667, 1.0 range
print(accuracy)

model.predict(sub_main)

# Determine amount of time to predict on the new samples
t1 = time()
pred = model.predict(sub_main)

print('predicting time: ', round(time() - t1, 3), 's')

print('')
print('Prediction: ')
print(pred)

I suspect that I should use the score() function, but I seem to keep implementing it incorrectly. I don't know whether that's the right function or not, so how would one get the confidence percentage of a classifier's prediction?

Accepted answer by Justin Peel

Per the SVC documentation, it looks like you need to change how you construct the SVC:

model = SVC(probability=True)

and then use the predict_proba method:

class_probabilities = model.predict_proba(sub_main)
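
As a minimal sketch of the two steps above: the iris dataset stands in for the asker's data (an assumption, since main, target and sub_main are not shown), and format_confidences is a hypothetical helper added only to print output in the style the question asks for. The columns returned by predict_proba follow the order of model.classes_. Keep in mind that SVC derives these probabilities through an internal cross-validated calibration step, so they can occasionally disagree with what predict returns.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Stand-in data (assumption): iris replaces the asker's main/target.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.4, random_state=0
)

# probability=True enables predict_proba, at the cost of slower training.
model = SVC(probability=True, random_state=0)
model.fit(X_train, y_train)

# One row per sample, one column per class, ordered like model.classes_.
class_probabilities = model.predict_proba(X_test)


def format_confidences(classes, probabilities):
    """Return lines such as 'Class 0: 97%' for one sample, most likely first."""
    ranked = sorted(zip(classes, probabilities), key=lambda pair: -pair[1])
    return ["Class {}: {:.0%}".format(label, p) for label, p in ranked]


# Print the per-class confidence for the first test sample.
for line in format_confidences(model.classes_, class_probabilities[0]):
    print(line)
```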

Answer by Jianxun Li

For estimators that implement the predict_proba() method, as Justin Peel suggested, you can just use predict_proba() to produce probabilities for your predictions.
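
One way to tell which case applies is to check for the method at runtime. The sketch below is illustrative only: confidence_or_none is a hypothetical helper, the iris dataset is stand-in data, and the three estimators are arbitrary picks (GaussianNB and LogisticRegression provide predict_proba, LinearSVC does not).

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import LinearSVC

# Stand-in data (assumption): iris instead of the asker's dataset.
X, y = load_iris(return_X_y=True)


def confidence_or_none(estimator, X_fit, y_fit, X_new):
    """Fit the estimator; return predict_proba output, or None if unsupported."""
    estimator.fit(X_fit, y_fit)
    if hasattr(estimator, "predict_proba"):
        return estimator.predict_proba(X_new)
    return None


# Show the class probabilities for the first sample, where available.
for estimator in (GaussianNB(), LogisticRegression(max_iter=1000), LinearSVC()):
    proba = confidence_or_none(estimator, X, y, X[:1])
    print(type(estimator).__name__, proba)
```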

For estimators that do not implement the predict_proba() method, you can construct a confidence interval yourself using the bootstrap concept (repeatedly calculate your point estimates on many sub-samples).

Let me know if you need any detailed examples to demonstrate either of these two cases.
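
For the bootstrap case, here is one possible reading of the idea, not necessarily what the answerer had in mind: refit the estimator on many resamples of the training data and record how often each class is predicted for each new sample. The helper bootstrap_vote_fractions, the choice of LinearSVC, the iris stand-in data and n_rounds=50 are all assumptions made for illustration. The resulting vote fractions indicate how stable a prediction is under resampling; they are not calibrated probabilities.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC
from sklearn.utils import resample

# Stand-in data (assumption): iris instead of the asker's dataset.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.4, random_state=0
)


def bootstrap_vote_fractions(X_fit, y_fit, X_new, n_rounds=50, seed=0):
    """Refit on bootstrap resamples; return (classes, per-sample vote fractions)."""
    classes = np.unique(y_fit)
    votes = np.zeros((len(X_new), len(classes)))
    rng = np.random.RandomState(seed)
    for _ in range(n_rounds):
        # Draw a resample of the training set, with replacement.
        X_boot, y_boot = resample(X_fit, y_fit, random_state=rng)
        model = LinearSVC(max_iter=10000).fit(X_boot, y_boot)
        pred = model.predict(X_new)
        # Each new sample casts one vote for its predicted class.
        for col, label in enumerate(classes):
            votes[:, col] += (pred == label)
    return classes, votes / n_rounds


classes, fractions = bootstrap_vote_fractions(X_train, y_train, X_test)
print(classes)       # class labels, in column order
print(fractions[0])  # vote fractions for the first test sample
```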
