Python: A column-vector y was passed when a 1d array was expected

Question

Asked by Klausos Klausos

I need to fit a RandomForestRegressor from sklearn.ensemble.

forest = ensemble.RandomForestRegressor(**RF_tuned_parameters)
model = forest.fit(train_fold, train_y)
yhat = model.predict(test_fold)

This code always worked until I did some preprocessing of the data (train_y). Now I get this message:

DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples,), for example using ravel().
model = forest.fit(train_fold, train_y)

Previously train_y was a Series; now it's a NumPy array (a column vector). If I apply train_y.ravel(), it becomes a row vector and no error message appears, though the prediction step takes a very long time (actually it never finishes...).

In the docs for RandomForestRegressor I found that train_y should be defined as y : array-like, shape = [n_samples] or [n_samples, n_outputs]. Any idea how to solve this issue?

Answer 1

Answered by Linda MacPhee-Cobb

Change this line:

model = forest.fit(train_fold, train_y)

to:

model = forest.fit(train_fold, train_y.values.ravel())

Edit:

For a single-column DataFrame, .values gives the values as a NumPy array (shape: (n, 1)).

.ravel converts that array's shape to (n,).
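
As a minimal sketch of the two steps described above, the snippet below builds a hypothetical single-column DataFrame standing in for train_y (the column name and values are assumptions made for illustration) and prints the shape after each step.

```python
import pandas as pd

# Hypothetical single-column target, standing in for train_y.
train_y = pd.DataFrame({"target": [1.5, 2.0, 3.5]})

column = train_y.values   # 2-D NumPy array with one column
flat = column.ravel()     # 1-D NumPy array

print(column.shape)  # (3, 1)
print(flat.shape)    # (3,)
```

Note that if train_y is already a NumPy array, as in the question, it has no .values attribute, so train_y.ravel() would be called directly.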

Answer 2

Answered by Coral

Use the code below:

model = forest.fit(train_fold, train_y.ravel())

If you then get an error like the one below:

"Unknown label type: %r" % y

Use this code:

y = train_y.ravel()
train_y = np.array(y).astype(int)
model = forest.fit(train_fold, train_y)
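
One caveat worth checking before using the cast above, offered as a side note rather than as part of the original answer: astype(int) truncates toward zero. That is harmless when the labels are whole-number class labels stored as floats, but it discards information when the targets are continuous, as they would be for a regressor. The values below are hypothetical.

```python
import numpy as np

# Hypothetical continuous targets shaped as a column vector.
train_y = np.array([[0.9], [1.5], [-0.7]])

y = train_y.ravel()
as_int = np.array(y).astype(int)  # truncates toward zero

print(y.tolist())       # [0.9, 1.5, -0.7]
print(as_int.tolist())  # [0, 1, 0]
```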

Answer 3

Answered by Simon Leung

I also ran into this warning while training a KNN classifier. It went away after I changed:

knn.fit(X_train,y_train)

to

knn.fit(X_train, np.ravel(y_train,order='C'))

Before this line, I imported NumPy with import numpy as np.
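
A small sketch of what that call does, using hypothetical labels: np.ravel accepts any array-like input (here a nested list), and order='C' (row-major) is its default, so passing it explicitly gives the same result as leaving it out.

```python
import numpy as np

# Hypothetical labels in a column layout, given as a nested list.
y_train = [[0], [1], [1], [0]]

flat_default = np.ravel(y_train)
flat_c = np.ravel(y_train, order='C')

print(flat_c.shape)                          # (4,)
print(np.array_equal(flat_default, flat_c))  # True
```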

Answer 4

Answered by sushmit

Another way to do this is to use reshape instead of ravel:

model = forest.fit(train_fold, train_y.values.reshape(-1,))
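
To illustrate that the two spellings are interchangeable here, the sketch below applies both to a hypothetical (4, 1) NumPy array. For a contiguous array like this one, the reshaped result shares memory with the original, so no data is copied.

```python
import numpy as np

# Hypothetical (4, 1) column vector.
train_y = np.arange(4.0).reshape(-1, 1)

a = train_y.reshape(-1,)  # same as reshape(-1)
b = train_y.ravel()

print(a.shape, b.shape)              # (4,) (4,)
print(np.array_equal(a, b))          # True
print(np.shares_memory(train_y, a))  # True
```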

Answer 5

Answered by mohammad hassan bigdeli shamlo

I had the same problem. The labels were in a column format (shape (n, 1)) while a 1-D array was expected. Use np.ravel():

knn.score(training_set, np.ravel(training_labels))

Hope this solves it.

Answer 6

Answered by AlexB

With Neuraxle, you can easily solve this:

p = Pipeline([
   # expected outputs shape: (n, 1)
   OutputTransformerWrapper(NumpyRavel()), 
   # expected outputs shape: (n, )
   RandomForestRegressor(**RF_tuned_parameters)
])

p, outputs = p.fit_transform(data_inputs, expected_outputs)

Neuraxle is a sklearn-like framework for hyperparameter tuning and AutoML in deep learning projects.

Answer 7

Answered by Bibby Wang

format_train_y=[]
for n in train_y:
    format_train_y.append(n[0])
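
The loop above comes without explanation. As a reading of it: it takes the first element of every row, which for an (n, 1) array yields the same values as ravel(), only as a Python list. A quick check with hypothetical data:

```python
import numpy as np

# Hypothetical (3, 1) column vector.
train_y = np.array([[3.0], [1.0], [2.0]])

format_train_y = []
for n in train_y:
    format_train_y.append(n[0])  # first (and only) element of each row

print(format_train_y == train_y.ravel().tolist())  # True
```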
