pandas error: Found array with dim 3. Estimator expected <= 2

Notice: this page reproduces a popular Stack Overflow question and its answer, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Stack Overflow original: http://stackoverflow.com/questions/34866548/

Time: 2020-09-14 00:31:40

python numpy pandas scikit-learn

Asked by ZJAY

I have a 14x5 data matrix named data. The first column (Y) is the dependent variable, followed by 4 independent variables (X, S1, S2, S3). When trying to fit a regression model to a subset of the rows of the independent variables (e.g. data['S2'][:T]), I get the following error:

ValueError: Found array with dim 3. Estimator expected <= 2.

I'd appreciate any insight on a fix. Code below.

import pandas as pd
import numpy as np
from sklearn.linear_model import LinearRegression


data = pd.read_csv('C:/path/Macro.csv')
T=len(data['X'])-1

#Fit variables
X = data['X'][:T]
S1 = data['S1'][:T]
S2 = data['S2'][:T]
S3 = data['S3'][:T]
Y = data['Y'][:T]

regressor = LinearRegression()
regressor.fit([[X,S1,S2,S3]], Y)

Answered by Igor Raush

You are passing a 3-dimensional array as the first argument to fit(). X, S1, S2, and S3 are all Series objects (1-dimensional), so the following

[[X, S1, S2, S3]]

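is 3-dimensional: the two pairs of brackets wrap the four length-T Series, giving shape (1, 4, T). sklearn estimators expect a 2-dimensional array of feature vectors, with shape (n_samples, n_features).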
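
To see the dimensionality problem directly, here is a small sketch. It assumes a randomly generated 14x5 DataFrame as a stand-in for Macro.csv (the real file is not available, so the values are made up), and it uses np.column_stack as one possible way to build a 2-dimensional feature array from the four Series.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for Macro.csv: 14 rows, columns Y, X, S1, S2, S3
rng = np.random.default_rng(0)
data = pd.DataFrame(rng.normal(size=(14, 5)),
                    columns=["Y", "X", "S1", "S2", "S3"])
T = len(data["X"]) - 1

# Same slicing as in the question: each variable is a 1-D Series of length T
X, S1, S2, S3 = (data[c][:T] for c in ["X", "S1", "S2", "S3"])

# What fit() effectively receives from [[X, S1, S2, S3]]
nested = np.asarray([[X, S1, S2, S3]])
print(nested.ndim, nested.shape)      # 3 (1, 4, 13)

# One column per feature gives the 2-D layout (n_samples, n_features)
features = np.column_stack([X, S1, S2, S3])
print(features.ndim, features.shape)  # 2 (13, 4)
```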

Try something like this:

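# pandas indexing syntax
# data.loc[ row labels/slice, column labels/slice ]
# (.ix was removed in pandas 1.0; use .loc for label-based indexing)
# .loc slices include the end label, so :T - 1 selects the first T rows,
# assuming the default integer index produced by read_csv

X = data.loc[:T - 1, 'X':]  # first T rows, columns from X onward
y = data.loc[:T - 1, 'Y']   # first T rows, Y column
regressor = LinearRegression()
regressor.fit(X, y)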
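
As a runnable illustration of the fix, the sketch below again assumes a randomly generated 14x5 DataFrame in place of Macro.csv. It uses positional .iloc slicing, which is a swapped-in alternative to the label-based indexing in the answer, and it also covers fitting on a single column such as S2: selecting with double brackets keeps the result 2-dimensional, whereas a bare Series would be 1-dimensional and rejected by fit().

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical stand-in for Macro.csv: 14 rows, columns Y, X, S1, S2, S3
rng = np.random.default_rng(0)
data = pd.DataFrame(rng.normal(size=(14, 5)),
                    columns=["Y", "X", "S1", "S2", "S3"])
T = len(data) - 1

y = data["Y"].iloc[:T]                     # first T rows of the target

# All four features: a 2-D DataFrame of shape (T, 4)
X_all = data[["X", "S1", "S2", "S3"]].iloc[:T]
model_all = LinearRegression().fit(X_all, y)

# A single feature: double brackets keep it 2-D, shape (T, 1)
X_s2 = data[["S2"]].iloc[:T]
model_s2 = LinearRegression().fit(X_s2, y)

print(X_all.shape, model_all.coef_.shape)  # (13, 4) (4,)
print(X_s2.shape, model_s2.coef_.shape)    # (13, 1) (1,)
```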