pandas Scikit-learn - Bad input shape error with multinomial logistic regression

Question

Asked by ExtremistEnigma

I'm implementing a multinomial logistic regression model in Python using Scikit-learn. Here's my code:

X = pd.concat([each for each in feature_cols], axis=1)
y = train[["<5", "5-6", "6-7", "7-8", "8-9", "9-10"]]
lm = LogisticRegression(multi_class='multinomial', solver='lbfgs')
lm.fit(X, y)

However, I'm getting ValueError: bad input shape (50184, 6) when it tries to execute the last line of code.

X is a DataFrame with 50184 rows and 7 columns. y also has 50184 rows, but 6 columns.

I ultimately want to predict in what bin (<5, 5-6, etc.) the outcome falls. All the independent and dependent variables used in this case are dummy columns which have a binary value of either 0 or 1. What am I missing?

Answer 1

Accepted answer by Stefan

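The Logistic Regression 3-class Classifier example illustrates that LogisticRegression is fit with a vector rather than a matrix as its target, in this case the target variable of the iris dataset, coded as values [0, 1, 2]. Your y therefore needs to be one-dimensional, with one class label per row, rather than a 6-column dummy matrix.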
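As a rough illustration of the shapes involved, here is a minimal sketch that fits the iris data. The names X_iris, y_iris and clf are my own, and max_iter=1000 is only an assumption to give the solver room to converge; it is not taken from the example itself.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

iris = load_iris()
X_iris = iris.data[:, :2]  # two of the four features, shape (150, 2)
y_iris = iris.target       # 1-D array of class labels 0, 1, 2

print(X_iris.shape)  # (150, 2)
print(y_iris.shape)  # (150,)

# The target passed to fit() is a vector: one label per row of X_iris.
clf = LogisticRegression(max_iter=1000)
clf.fit(X_iris, y_iris)
print(clf.classes_)  # [0 1 2]
```

The point is only that y_iris has shape (150,), not (150, 3), even though there are three classes.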

To convert the dummy matrix to a series, you could multiply each column by a different integer and then, assuming it's a pandas.DataFrame, just call .sum(axis=1) on the result. Something like:

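# Work on a copy so the dummy columns in train are left untouched
y = y.copy()
for i, col in enumerate(y.columns.tolist(), 1):
    y.loc[:, col] *= i  # scale the i-th dummy column by its class label i
y = y.sum(axis=1)  # one 1 per row, so the row sum is that row's class label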
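An alternative that is not part of the answer above, but should give an equivalent result, is to ask pandas which column holds the 1 in each row. The sketch below uses made-up stand-in data (y_dummies, X_demo, and the feature names x0..x6 are hypothetical, since the real train frame isn't shown) and assumes every row has exactly one 1.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

bins = ["<5", "5-6", "6-7", "7-8", "8-9", "9-10"]

# Hypothetical stand-in data: 60 rows, one-hot targets, 7 binary features.
rng = np.random.default_rng(0)
codes = np.arange(60) % len(bins)  # cycle through the bins so each occurs
y_dummies = pd.DataFrame(np.eye(len(bins), dtype=int)[codes], columns=bins)
X_demo = pd.DataFrame(
    rng.integers(0, 2, size=(60, 7)),
    columns=[f"x{i}" for i in range(7)],
)

# idxmax(axis=1) returns, per row, the name of the column holding the 1.
y_labels = y_dummies.idxmax(axis=1)
print(y_labels.tolist()[:3])  # ['<5', '5-6', '6-7']

# Integer codes 1..6, matching what the multiply-and-sum loop produces.
y_codes = pd.Series(
    y_dummies.to_numpy().argmax(axis=1) + 1, index=y_dummies.index
)

# Either 1-D target works; string labels keep predictions readable.
lm = LogisticRegression(max_iter=1000)
lm.fit(X_demo, y_labels)
print(lm.predict(X_demo.head(3)).shape)  # (3,)
```

Note that a row of all zeros would silently map to the first column with idxmax, so it may be worth checking that y.sum(axis=1) equals 1 everywhere first.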
