Disclaimer: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. If you use it, you must comply with the same CC BY-SA license, cite the original URL, and attribute it to the original authors (not me): StackOverflow
原文地址: http://stackoverflow.com/questions/22851316/
What is the inverse of regularization strength in Logistic Regression? How should it affect my code?
Asked by user3427495
I am using sklearn.linear_model.LogisticRegression in scikit-learn to run a logistic regression.
C : float, optional (default=1.0) — Inverse of regularization strength; must be a positive float. Like in support vector machines, smaller values specify stronger regularization.
What does C mean here, in simple terms? What is regularization strength?
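A minimal runnable sketch of what I'm doing (using the built-in iris data as a stand-in for my own dataset):

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)

# C is the parameter I'm asking about (default 1.0).
clf = LogisticRegression(C=1.0, max_iter=200)
clf.fit(X, y)
print(clf.score(X, y))  # training accuracy
```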
Accepted answer by TooTone
Regularization is applying a penalty to increasing the magnitude of parameter values in order to reduce overfitting. When you train a model such as a logistic regression model, you are choosing the parameters that give you the best fit to the data. This means minimizing the error between what the model predicts for your dependent variable, given your data, and what your dependent variable actually is.
The problem comes when you have a lot of parameters (a lot of independent variables) but not much data. In this case, the model will often tailor the parameter values to idiosyncrasies in your data -- which means it fits your data almost perfectly. However, because those idiosyncrasies don't appear in the future data you see, your model predicts poorly.
To solve this, as well as minimizing the error as already discussed, you add a second term to what is minimized: a function that penalizes large parameter values. Most often that function is λ∑_j θ_j², some constant λ times the sum of the squared parameter values. The larger λ is, the less likely it is that the parameters will grow in magnitude simply to adjust for small perturbations in the data. In your case, however, rather than specifying λ, you specify C = 1/λ.
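Spelled out for logistic regression with the usual cross-entropy loss (a standard formulation, not quoted from the original answer), the regularized objective being minimized is:

```latex
J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\Big[\, y_i \log h_\theta(x_i) + (1 - y_i)\log\big(1 - h_\theta(x_i)\big) \Big] \;+\; \lambda \sum_j \theta_j^2
```

The first term measures fit to the data; the second is the penalty on large parameter values, and scikit-learn's C plays the role of 1/λ.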
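You can see this in code (a sketch on synthetic data; the dataset and C values are made up for illustration): shrinking C strengthens the penalty, which shrinks the fitted coefficients.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic data with many features relative to samples,
# the situation where overfitting is likely.
X, y = make_classification(n_samples=50, n_features=20,
                           n_informative=5, random_state=0)

for C in (100.0, 1.0, 0.01):
    model = LogisticRegression(C=C, max_iter=1000).fit(X, y)
    # Smaller C = larger λ = stronger regularization
    # = smaller coefficient magnitudes.
    print(f"C={C:>6}: sum of squared coefficients = "
          f"{np.sum(model.coef_ ** 2):.3f}")
```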