python - How to append a numpy array to a Pandas dataframe

Question

Asked by DBE7

I have trained a Logistic Regression classifier to predict whether a review is positive or negative. Now I want to append the predicted probabilities returned by the predict_proba function to my Pandas data frame containing the reviews. I tried something like:

test_data['prediction'] = sentiment_model.predict_proba(test_matrix)

Obviously, that doesn't work, since predict_proba returns a 2D numpy array. So, what is the most efficient way of doing this? I created test_matrix with scikit-learn's CountVectorizer:

vectorizer = CountVectorizer(token_pattern=r'\b\w+\b')
train_matrix = vectorizer.fit_transform(train_data['review_clean'].values.astype('U'))
test_matrix = vectorizer.transform(test_data['review_clean'].values.astype('U'))

Sample data looks like:

| Review                                     | Prediction         |                      
| ------------------------------------------ | ------------------ |
| "Toy was great! Our six-year old loved it!"|   0.986            |

Answer 1

Answered by Karthik Arumugham

Assign the predictions to a variable, then extract each column from that variable and assign it to its own column of the pandas dataframe. If x is the 2D numpy array with predictions,

x = sentiment_model.predict_proba(test_matrix)

then you can do:

test_data['prediction0'] = x[:,0]
test_data['prediction1'] = x[:,1]
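
For a binary classifier, predict_proba returns an array of shape (n_samples, 2), and in scikit-learn the column order follows the model's classes_ attribute. The sketch below illustrates the assignment on its own, without training a model: the array x is a hand-written stand-in for the output of sentiment_model.predict_proba(test_matrix), the second review and its probabilities are made up for illustration, and it assumes the second column corresponds to the positive class.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for test_data. The first review comes from the
# sample table in the question; the second one is invented.
test_data = pd.DataFrame({
    "review_clean": [
        "Toy was great! Our six-year old loved it!",
        "Broke after one day",
    ]
})

# Stand-in for sentiment_model.predict_proba(test_matrix):
# one row per review, one column per class, each row sums to 1.
x = np.array([
    [0.014, 0.986],
    [0.900, 0.100],
])

# Option 1: add every class probability at once. Reusing test_data.index
# keeps the rows aligned even if the index is not 0..n-1.
proba = pd.DataFrame(
    x,
    columns=["prediction0", "prediction1"],
    index=test_data.index,
)
test_data = test_data.join(proba)

# Option 2: keep only one class, which matches the single
# "Prediction" column in the sample data (assumed: column 1 is positive).
test_data["prediction"] = x[:, 1]

print(test_data)
```

Assigning a plain numpy array to a column is positional, so it works as long as the array has one entry per row of the dataframe.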
