Python: How do I apply scikit-learn's LogisticRegression to some decimal data?

Notice: this page reproduces a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/18030048/

Time: 2020-08-19 09:46:15  Source: igfitidea

How do I apply scikit-learn's LogisticRegression for some decimal data?

python machine-learning scikit-learn linear-regression logistics

Asked by WoooHaaaa

I have a training data set like this:

0.00479616 |  0.0119904 |  0.00483092 |  0.0120773 | 1
0.51213136 |  0.0113404 |  0.02383092 |  -0.012073 | 0
0.10479096 |  -0.011704 |  -0.0453692 |  0.0350773 | 0

The first four columns are the features of one sample, and the last column is its output.

I use scikit-learn this way:

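  import numpy as np
  from sklearn import linear_model

  # `data` holds the rows shown above: four features, then the label.
  data = np.array(data)
  lr = linear_model.LogisticRegression(C=10)

  X = data[:, :-1]  # feature columns
  Y = data[:, -1]   # label column
  lr.fit(X, Y)

  print(lr)
  # The output is always 1 or 0, not a probability number.
  # predict expects a 2-D array, so the single sample is reshaped to one row.
  print(lr.predict(data[0][:-1].reshape(1, -1)))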

I thought logistic regression should always give a probability between 0 and 1.

Accepted answer by Fred Foo

Use the predict_proba method to get probabilities. predict gives class labels.

>>> import numpy as np
>>> from sklearn.linear_model import LogisticRegression
>>> lr = LogisticRegression()
>>> X = np.random.randn(3, 4)
>>> y = [1, 0, 0]
>>> lr.fit(X, y)
LogisticRegression(C=1.0, class_weight=None, dual=False, fit_intercept=True,
          intercept_scaling=1, penalty='l2', random_state=None, tol=0.0001)
>>> lr.predict_proba(X[:1])  # 2-D input: the first sample as a single row
array([[ 0.49197272,  0.50802728]])
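
To make the difference concrete, here is a minimal sketch that assumes a recent scikit-learn and NumPy. It reuses the three rows from the question; the names p_one and scores are illustrative only and do not come from the original post. It checks two things: the columns of predict_proba follow the order of lr.classes_, and predict returns the class with the highest probability.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# The three rows from the question: four features, then a 0/1 label.
data = np.array([
    [0.00479616, 0.0119904, 0.00483092, 0.0120773, 1],
    [0.51213136, 0.0113404, 0.02383092, -0.012073, 0],
    [0.10479096, -0.011704, -0.0453692, 0.0350773, 0],
])
X = data[:, :-1]
y = data[:, -1].astype(int)

lr = LogisticRegression(C=10).fit(X, y)

labels = lr.predict(X)       # hard class labels, shape (3,)
proba = lr.predict_proba(X)  # one row per sample, one column per class

# Columns of proba follow the order of lr.classes_ (here [0, 1]),
# so column 1 is the estimated P(y = 1 | x).
p_one = proba[:, 1]

# predict picks the class with the highest probability.
assert np.array_equal(labels, lr.classes_[proba.argmax(axis=1)])

# For a binary problem the probability is the logistic sigmoid of the
# linear score returned by decision_function.
scores = lr.decision_function(X)
assert np.allclose(p_one, 1.0 / (1.0 + np.exp(-scores)))

print(lr.classes_)
print(proba)
```

In recent scikit-learn versions both methods expect a 2-D array, so a single sample is passed as X[:1] or x.reshape(1, -1) rather than as a 1-D vector.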

(If you had read the documentation, you would have found this out.)
