pandas / sklearn: Found input variables with inconsistent numbers of samples: [1, 99]

Question

Asked by sheldonzy

I'm trying to fit a simple regression line with pandas and scikit-learn in Spyder. After executing the following code, I got this error:

Found input variables with inconsistent numbers of samples: [1, 99]

the code:

import numpy as np
import pandas as pd

dataset = pd.read_csv('Phil.csv')

x = dataset.iloc[:, 0].values
y = dataset.iloc[:, 2].values

from sklearn.linear_model import LinearRegression
regressor = LinearRegression()
regressor.fit(x, y)

I think I know what the problem is, but I'm not quite sure how to deal with the syntax. In the variable explorer, the size of x (and y) is (99L,), and from what I remember it can't be a vector; it must have size (99,1). Same thing for y.

Saw a bunch of related topics, but none of them helped. Thanks.

Answer 1

Answered by Peter Mularien

Referring to the sklearn documentation for LinearRegression (http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LinearRegression.html#sklearn.linear_model.LinearRegression.fit), the X vector needs to conform to the specification [n_samples, n_features].

Since you have only a single feature with many samples, the shape should be (99, 1), i.e., a single value per "row" with a single "column".

There are many ways to accomplish this (ref: Efficient way to add a singleton dimension to a NumPy vector so that slice assignments work). In your case, the following should work:

regressor.fit(x[:, None], y)
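
To see what x[:, None] does, here is a minimal sketch that compares it with two other common NumPy spellings of the same reshape. The array is synthetic (an assumption standing in for the column read from Phil.csv, whose contents are not shown in the question).

```python
import numpy as np

# Synthetic stand-in for the question's column: 99 samples, one feature.
x = np.arange(99, dtype=float)
print(x.shape)  # (99,)

# Three equivalent ways to add a trailing axis of length 1.
a = x[:, None]
b = x[:, np.newaxis]   # np.newaxis is an alias for None
c = x.reshape(-1, 1)   # -1 lets NumPy infer the number of rows

print(a.shape, b.shape, c.shape)  # (99, 1) (99, 1) (99, 1)
```

On the pandas side, selecting the column with a list, as in dataset.iloc[:, [0]].values, should also give a 2-D array, since a list selection keeps the result a DataFrame.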

Don't forget that predict requires input data of the same shape!
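
As a rough illustration of that point, the sketch below fits on a (99, 1) array and then reshapes new inputs the same way before calling predict. The data is made up (an exact line y = 2x + 1, not the contents of Phil.csv), and x_new is a hypothetical name introduced only for this example.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data (an assumption): 99 samples on the exact line y = 2x + 1.
x = np.arange(99, dtype=float)
y = 2.0 * x + 1.0

regressor = LinearRegression()
regressor.fit(x[:, None], y)  # X has shape (99, 1), y stays 1-D

# New inputs need the same 2-D layout: (n_samples, 1).
x_new = np.array([100.0, 101.0])
pred = regressor.predict(x_new[:, None])
print(pred.shape)  # (2,)
```

Note that only X needs the extra axis here; y can remain a 1-D array of length 99.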
