A column-vector y was passed when a 1d array was expected
Notice: this page is a Chinese-English parallel translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow
Original source: http://stackoverflow.com/questions/34165731/
Asked by Klausos Klausos
I need to fit RandomForestRegressor from sklearn.ensemble.
forest = ensemble.RandomForestRegressor(**RF_tuned_parameters)
model = forest.fit(train_fold, train_y)
yhat = model.predict(test_fold)
This code always worked until I did some preprocessing of the data (train_y). The error message says:
DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples,), for example using ravel().
model = forest.fit(train_fold, train_y)
Previously train_y was a Series; now it is a numpy array (a column vector). If I apply train_y.ravel(), it becomes a flat 1-D array and the message no longer appears, though the prediction step takes a very long time (actually it never finishes...).
In the docs of RandomForestRegressor I found that train_y should be defined as y : array-like, shape = [n_samples] or [n_samples, n_outputs]
Any idea how to solve this issue?
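To reproduce the warning outside the original project, a minimal sketch with made-up random data may help. The array sizes, `n_estimators=5`, and all variable names here are assumptions for illustration, not taken from the question.

```python
import warnings

import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.exceptions import DataConversionWarning

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))      # made-up features
y_col = rng.normal(size=(20, 1))  # column vector, shape (20, 1)

forest = RandomForestRegressor(n_estimators=5, random_state=0)

# Fit with the (n, 1) target and record any warnings.
with warnings.catch_warnings(record=True) as caught_col:
    warnings.simplefilter("always")
    forest.fit(X, y_col)

# Fit with the flattened (n,) target and record any warnings.
with warnings.catch_warnings(record=True) as caught_flat:
    warnings.simplefilter("always")
    forest.fit(X, y_col.ravel())

col_warned = any(issubclass(w.category, DataConversionWarning) for w in caught_col)
flat_warned = any(issubclass(w.category, DataConversionWarning) for w in caught_flat)
print(col_warned, flat_warned)
```

The message is a warning rather than an exception, so the fit itself still completes in both cases.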
Answered by Linda MacPhee-Cobb
Change this line:
model = forest.fit(train_fold, train_y)
to:
model = forest.fit(train_fold, train_y.values.ravel())
Edit:
.values will give the values in an array (shape: (n, 1)).
.ravel will convert that array shape to (n, ).
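The two steps can be checked on a small example. This is only a sketch: the one-column DataFrame and its column name `target` are hypothetical stand-ins for train_y.

```python
import pandas as pd

# Hypothetical single-column target, standing in for train_y.
train_y = pd.DataFrame({"target": [1.5, 2.0, 3.25]})

as_array = train_y.values  # 2-D NumPy array with one column
flat = as_array.ravel()    # flattened to 1-D

print(as_array.shape)  # (3, 1)
print(flat.shape)      # (3,)
```

If train_y is already a NumPy array, as in the question, it has no `.values` attribute and `train_y.ravel()` alone is enough.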
Answered by Coral
Use the code below:
model = forest.fit(train_fold, train_y.ravel())
If you are then still hit by an error like the one below:
Unknown label type: %r" % y
use this code:
y = train_y.ravel()
train_y = np.array(y).astype(int)
model = forest.fit(train_fold, train_y)
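One caveat worth checking before using the cast: `astype(int)` discards the fractional part of each value. The sketch below uses made-up targets to show the effect. As far as I know, the "Unknown label type" error comes from classifiers that receive continuous targets, so the cast suits class labels rather than regression targets; treat that as an assumption about the setup, not something stated in the answer.

```python
import numpy as np

# Hypothetical column-vector targets with non-integer values.
train_y = np.array([[0.0], [1.9], [2.5], [-1.7]])

y = train_y.ravel()
as_int = np.array(y).astype(int)  # truncates toward zero

print(as_int.tolist())  # [0, 1, 2, -1]
```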
Answered by Simon Leung
I also encountered this situation when I was trying to train a KNN classifier, but the warning went away after I changed knn.fit(X_train, y_train) to knn.fit(X_train, np.ravel(y_train, order='C')).
Ahead of this line I used import numpy as np.
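For reference, `order='C'` is the default of `np.ravel`, so spelling it out gives the same result as omitting it. A minimal sketch, with a made-up `y_train`:

```python
import numpy as np

# Hypothetical column-vector labels.
y_train = np.array([[3], [1], [2]])

explicit = np.ravel(y_train, order='C')
default = np.ravel(y_train)

print(explicit.shape)
print(np.array_equal(explicit, default))
```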
Answered by sushmit
Another way of doing this is to use reshape instead of ravel:
model = forest.fit(train_fold, train_y.values.reshape(-1,))
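A quick check that `reshape(-1,)` yields the same 1-D result as `ravel()`. The array below is a made-up stand-in for `train_y.values`.

```python
import numpy as np

# Hypothetical (4, 1) column vector, standing in for train_y.values.
train_y_values = np.arange(4.0).reshape(4, 1)

via_reshape = train_y_values.reshape(-1,)
via_ravel = train_y_values.ravel()
same = np.array_equal(via_reshape, via_ravel)

print(via_reshape.shape, same)
```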
Answered by mohammad hassan bigdeli shamlo
I had the same problem. The labels were in a column format (shape (n, 1)) while a flat 1-D array was expected.
Use np.ravel():
knn.score(training_set, np.ravel(training_labels))
Hope this solves it.
Answered by AlexB
With Neuraxle, you can easily solve this:
p = Pipeline([
    # expected outputs shape: (n, 1)
    OutputTransformerWrapper(NumpyRavel()),
    # expected outputs shape: (n, )
    RandomForestRegressor(**RF_tuned_parameters)
])
p, outputs = p.fit_transform(data_inputs, expected_outputs)
Neuraxle is a sklearn-like framework for hyperparameter tuning and AutoML in deep learning projects!
Answered by Bibby Wang
format_train_y = []
for n in train_y:
    format_train_y.append(n[0])
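The loop takes the first element of each row, which for an (n, 1) array gives the same values as `ravel()`. A small sketch with made-up data compares the loop, an equivalent list comprehension, and `ravel()`. Note that the first two produce a Python list while `ravel()` returns a NumPy array.

```python
import numpy as np

# Hypothetical (3, 1) column vector, standing in for train_y.
train_y = np.array([[10], [20], [30]])

# Explicit loop, as in the answer.
format_train_y = []
for n in train_y:
    format_train_y.append(n[0])

# Equivalent list comprehension.
comprehension = [row[0] for row in train_y]

# Vectorized alternative.
flat = train_y.ravel()

print(format_train_y == comprehension)
print(flat.shape)
```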