Python Scikit Learn SVC decision_function and predict
Original URL: http://stackoverflow.com/questions/20113206/
Warning: these are provided under the CC BY-SA 4.0 license. You are free to use/share them, but you must attribute them to the original authors (not me): StackOverflow
Scikit Learn SVC decision_function and predict
Asked by Peter Tseng
I'm trying to understand the relationship between decision_function and predict, which are instance methods of SVC (http://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html). So far I've gathered that decision function returns pairwise scores between classes. I was under the impression that predict chooses the class that maximizes its pairwise score, but I tested this out and got different results. Here's the code I was using to try and understand the relationship between the two. First I generated the pairwise score matrix, and then I printed out the class that has maximal pairwise score which was different than the class predicted by clf.predict.
import numpy as np

result = clf.decision_function(vector)[0]
counter = 0
num_classes = len(clf.classes_)
pairwise_scores = np.zeros((num_classes, num_classes))
for r in range(num_classes):
    for j in range(r + 1, num_classes):
        pairwise_scores[r][j] = result[counter]
        pairwise_scores[j][r] = -result[counter]
        counter += 1
# flat index of the largest entry in the pairwise score matrix
index = np.argmax(pairwise_scores)
# row of that entry, i.e. the class with the maximal pairwise score
class_index = index // num_classes
print(class_index)
print(clf.predict(vector)[0])
Does anyone know the relationship between predict and decision_function?
Answered by Martin Böschen
I don't fully understand your code, but let's go through the example on the documentation page you referenced:
import numpy as np
X = np.array([[-1, -1], [-2, -1], [1, 1], [2, 1]])
y = np.array([1, 1, 2, 2])
from sklearn.svm import SVC
clf = SVC()
clf.fit(X, y)
Now let's apply both the decision function and predict to the samples:
clf.decision_function(X)
clf.predict(X)
The output we get is:
array([[-1.00052254],
[-1.00006594],
[ 1.00029424],
[ 1.00029424]])
array([1, 1, 2, 2])
And that is easy to interpret: the decision function tells us on which side of the hyperplane generated by the classifier we are (and how far we are from it). Based on that information, the estimator then labels the examples with the corresponding label.
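As a quick check of that sign rule, here is a minimal sketch (reusing the clf and X fitted above, where clf.classes_ is [1, 2]):

import numpy as np

# Negative decision value -> first class, positive -> second class
df = clf.decision_function(X)
print(np.where(df.ravel() > 0, clf.classes_[1], clf.classes_[0]))  # [1 1 2 2]
print(clf.predict(X))                                              # [1 1 2 2]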
Answered by Bilal Dadanlar
They probably have a somewhat complicated mathematical relationship. But if you use decision_function in the LinearSVC classifier, the relation between those two will be clearer! Because then decision_function will give you scores for each class label (not the same as SVC), and predict will give the class with the best score.
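A minimal sketch of that per-class scoring, assuming a hypothetical three-class toy set (X3 and y3 are made up for illustration; with only two classes, LinearSVC returns a single score per sample instead):

from sklearn.svm import LinearSVC
import numpy as np

X3 = np.array([[-2, -1], [-1, -1], [0, 2], [1, 2], [3, 0], [4, 0]])
y3 = np.array([0, 0, 1, 1, 2, 2])
lin_clf = LinearSVC().fit(X3, y3)

# One score per class and per sample; predict is the argmax over classes
scores = lin_clf.decision_function(X3)  # shape (n_samples, n_classes)
print(lin_clf.classes_[np.argmax(scores, axis=1)])
print(lin_clf.predict(X3))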
Answered by RomanCobra
When you call decision_function(), you get the output from each of the pairwise classifiers (n*(n-1)/2 numbers total). See pages 127 and 128 of "Support Vector Machines for Pattern Classification".
Each classifier puts in a vote as to what the correct answer is (based on the sign of the output of that classifier); predict() returns the class with the most votes.
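A small sanity check on that count (a sketch, assuming clf is an SVC fitted on more than two classes with decision_function_shape='ovo'; a binary problem yields just a single score):

n = len(clf.classes_)
# one decision_function column per unordered pair of classes
print(clf.decision_function(X).shape[1] == n * (n - 1) // 2)  # True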
Answered by bcorso
For those interested, I'll post a quick example of the predict function translated from C++ (here) to Python:
import math
import numpy as np

# I've only implemented the linear and rbf kernels
def kernel(params, sv, X):
    # params is the dict returned by clf.get_params()
    if params['kernel'] == 'linear':
        return [np.dot(vi, X) for vi in sv]
    elif params['kernel'] == 'rbf':
        return [math.exp(-params['gamma'] * np.dot(vi - X, vi - X)) for vi in sv]

# This replicates clf.decision_function(X)
def decision_function(params, sv, nv, a, b, X):
    # calculate the kernels
    k = kernel(params, sv, X)

    # define the start and end index for support vectors for each class
    start = [sum(nv[:i]) for i in range(len(nv))]
    end = [start[i] + nv[i] for i in range(len(nv))]

    # calculate: sum(a_p * k(x_p, x)) between every 2 classes
    c = [sum(a[i][p] * k[p] for p in range(start[j], end[j])) +
         sum(a[j - 1][p] * k[p] for p in range(start[i], end[i]))
         for i in range(len(nv)) for j in range(i + 1, len(nv))]

    # add the intercept
    return [sum(x) for x in zip(c, b)]

# This replicates clf.predict(X)
def predict(params, sv, nv, a, b, cs, X):
    ''' params = model parameters (clf.get_params())
        sv = support vectors
        nv = # of support vectors per class
        a  = dual coefficients
        b  = intercepts
        cs = list of class names
        X  = feature vector to predict
    '''
    decision = decision_function(params, sv, nv, a, b, X)
    # each pairwise score casts a vote: positive -> first class of the pair
    votes = [(i if decision[p] > 0 else j) for p, (i, j) in
             enumerate((i, j)
                       for i in range(len(cs))
                       for j in range(i + 1, len(cs)))]
    return cs[max(set(votes), key=votes.count)]
There are a lot of input arguments for predict and decision_function, but note that these are all used internally by the model when calling predict(X). In fact, all of the arguments are accessible to you inside the model after fitting:
from sklearn import svm

# Create model
clf = svm.SVC(gamma=0.001, C=100.)

# Fit model using features, X, and labels, y.
clf.fit(X, y)

# Get parameters from model
params = clf.get_params()
sv = clf.support_vectors_
nv = clf.n_support_
a = clf.dual_coef_
b = clf._intercept_  # note the private _intercept_, which can differ in sign from clf.intercept_
cs = clf.classes_

# Use the functions to predict a single feature vector X
print(predict(params, sv, nv, a, b, cs, X))

# Compare with the builtin predict
print(clf.predict(X))
Answered by serv-inc
There's a really nice Q&A for the multi-class one-vs-one scenario at datascience.sx:
Question
I have a multiclass SVM classifier with labels 'A', 'B', 'C', 'D'.
This is the code I'm running:
>>> print clf.predict([predict_this])
['A']
>>> print clf.decision_function([predict_this])
[[ 185.23220833   43.62763596  180.83305074  -93.58628288   62.51448055  173.43335293]]
How can I use the output of the decision function to predict the class (A/B/C/D) with the highest probability and, if possible, its value? I have visited https://stackoverflow.com/a/20114601/7760998 but it is for binary classifiers, and I could not find a good resource which explains the output of decision_function for multiclass classifiers with shape ovo (one-vs-one).
Edit:
The above example is for class 'A'. For another input the classifier predicted 'C' and gave the following result from decision_function:
[[ 96.42193513  -11.13296606  111.47424538  -88.5356536    44.29272494  141.0069203 ]]
For another different input, which the classifier also predicted as 'C', decision_function gave the following result:
[[ 290.54180354 -133.93467605  116.37068951 -392.32251314 -130.84421412  284.87653043]]
Had it been ovr (one-vs-rest), it would be easier to just select the one with the higher value, but in ovo (one-vs-one) there are (n * (n - 1)) / 2 values in the resulting list. How do I deduce which class would be selected based on the decision function?
Answer
Your link has sufficient resources, so let's go through them:
When you call decision_function(), you get the output from each of the pairwise classifiers (n*(n-1)/2 numbers total). See pages 127 and 128 of "Support Vector Machines for Pattern Classification".
Click on the "page 127 and 128" link (not shown here, but in the Stackoverflow answer). You should see:
- Python's SVM implementation uses one-vs-one. That's exactly what the book is talking about.
- For each pairwise comparison, we measure the decision function
- The decision function is just the regular binary SVM decision boundary
What does that have to do with your question?
- clf.decision_function() will give you the $D$ for each pairwise comparison
- The class with the most votes wins
For instance,
[[ 96.42193513 -11.13296606 111.47424538 -88.5356536 44.29272494 141.0069203 ]]
is comparing:
[AB, AC, AD, BC, BD, CD]
We label each of them by the sign. We get:
[A, C, A, C, B, C]
For instance, 96.42193513 is positive and thus A is the label for AB.
Now we have three votes for C, so C would be your prediction. If you repeat my procedure for the other two examples, you will get Python's predictions. Try it!
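A short sketch that reproduces this vote count in code, using the decision values quoted above:

decision = [96.42193513, -11.13296606, 111.47424538,
            -88.5356536, 44.29272494, 141.0069203]
pairs = ['AB', 'AC', 'AD', 'BC', 'BD', 'CD']
# positive score -> first class of the pair gets the vote, negative -> second
votes = [p[0] if d > 0 else p[1] for p, d in zip(pairs, decision)]
print(votes)                             # ['A', 'C', 'A', 'C', 'B', 'C']
print(max(set(votes), key=votes.count))  # C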
Answered by Robin van Emden
Predict() follows a pairwise voting scheme: it returns the class with the most votes over all pairwise comparisons. When two classes score the same, the class with the lowest index is returned.
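The lowest-index tie-break simply falls out of numpy's argmax, which returns the first maximal entry, as this tiny sketch shows:

from numpy import argmax
# argmax returns the index of the first occurrence of the maximum
print(argmax([3, 5, 5, 1]))  # 1, not 2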
Below is a Python example that applies this voting scheme to the n*(n-1)/2 pairwise scores returned by a one-versus-one decision_function().
from sklearn import svm
from sklearn import datasets
from numpy import argmax, zeros
from itertools import combinations

# do pairwise comparisons, return class with most +1 votes
def ovo_vote(classes, decision_function):
    combos = list(combinations(classes, 2))
    votes = zeros(len(classes))
    for i in range(len(decision_function[0])):
        if decision_function[0][i] > 0:
            votes[combos[i][0]] = votes[combos[i][0]] + 1
        else:
            votes[combos[i][1]] = votes[combos[i][1]] + 1
    winner = argmax(votes)
    return classes[winner]

# load the digits data set
digits = datasets.load_digits()
X, y = digits.data, digits.target

# set the SVC's decision function shape to "ovo"
# (newer scikit-learn versions default to "ovr", so set it explicitly)
estimator = svm.SVC(gamma=0.001, C=100., decision_function_shape='ovo')

# train SVC on all but the last digit
estimator.fit(X[:-1], y[:-1])

# print the value of the last digit
print("To be classified digit: ", y[-1:][0])

# print the predicted class
pred = estimator.predict(X[-1:])
print("Perform classification using predict: ", pred[0])

# get decision function
df = estimator.decision_function(X[-1:])

# print the decision function itself
print("Decision function consists of", len(df[0]), "elements:")
print(df)

# get classes, here, numbers 0 to 9
digits = estimator.classes_

# print which class has most votes
vote = ovo_vote(digits, df)
print("Perform classification using decision function: ", vote)


