Python: saving prediction results to CSV

Note: this page is a Chinese-English translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow. Original URL: http://stackoverflow.com/questions/34864695/

Date: 2020-08-19 15:38:57  Source: igfitidea

Saving prediction results to CSV

Tags: python, numpy, pandas, scikit-learn

Asked by ZJAY

I am storing the results from a sklearn regression model in the variable prediction.

prediction = regressor.predict(data[['X']])
print(prediction)

The prediction output looks like this:

[ 266.77832991  201.06347505  446.00066136  499.76736079  295.15519906
  214.50514991  422.1043505   531.13126879  287.68760191  201.06347505
  402.68859792  478.85808879  286.19408248  192.10235848]

I am then trying to use the to_csv function to save the results to a local CSV file:

prediction.to_csv('C:/localpath/test.csv')

But the error I get back is:

AttributeError: 'numpy.ndarray' object has no attribute 'to_csv'

I am using Pandas/Numpy/SKlearn. Any idea on the basic fix?

Answered by DavidK

You can use pandas. As the error message says, NumPy arrays don't have a to_csv method, but a pandas DataFrame does.

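import numpy as np
import pandas as pd
# Wrap the array from the question in a one-column DataFrame and write it out.
# to_csv returns None when given a path, so don't assign its result.
pd.DataFrame(prediction, columns=['predictions']).to_csv('prediction.csv')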

Add ".T" (transpose) before the to_csv call if you want the values laid out in a single row instead of a column.
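To make the column versus row layout concrete, here is a minimal sketch. The three values are a hypothetical stand-in for the array returned by regressor.predict, and to_csv is called without a path so that it returns the CSV text instead of writing a file.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the array returned by regressor.predict(...)
prediction = np.array([266.77832991, 201.06347505, 446.00066136])

# One value per line: wrap the 1-D array in a single-column DataFrame.
column_df = pd.DataFrame(prediction, columns=["predictions"])

# All values on one line: transpose the same frame.
row_df = column_df.T

# With no path argument, to_csv returns the CSV content as a string.
# Pass a file path instead to write to disk.
column_csv = column_df.to_csv(index=False)
row_csv = row_df.to_csv(index=False, header=False)

print(column_csv)
print(row_csv)
```

Passing index=False leaves out the row labels; drop that argument if you want them in the file.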

Answered by Ali

You can use the numpy.savetxt function:

import numpy
numpy.savetxt('C:/localpath/test.csv', prediction, delimiter=',')

To load a CSV file back, you can use the numpy.genfromtxt function:

numpy.genfromtxt('C:/localpath/test.csv', delimiter=',')
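A small round trip shows the two calls working together. This is only a sketch: the array values, the temporary file location, and the "prediction" header are assumptions for illustration, not part of the original answer.

```python
import os
import tempfile

import numpy as np

# Hypothetical stand-in for the array returned by regressor.predict(...)
prediction = np.array([266.77832991, 201.06347505, 446.00066136])

# Hypothetical output location inside a fresh temporary directory
csv_path = os.path.join(tempfile.mkdtemp(), "test.csv")

# A 1-D array is written one value per line.
# comments="" stops savetxt from prefixing the header line with "# ".
np.savetxt(csv_path, prediction, delimiter=",", header="prediction", comments="")

# skip_header=1 skips the header line written above.
loaded = np.genfromtxt(csv_path, delimiter=",", skip_header=1)

print(np.allclose(loaded, prediction))
```

The header and comments arguments are optional; without them the file holds only the numbers and skip_header is not needed when reading it back.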

Answered by Ilker Kurtulus

This is a fairly detailed solution for a case like this one, but you can use it even in production.

First Save the Model

import joblib
joblib.dump(regressor, "regressor.sav")

Save columns in order

pd.DataFrame(X_train.columns).to_csv("feature_list.csv", index = None)

Save data types of train set

pd.DataFrame(X_train.dtypes).reset_index().to_csv("data_types.csv", index = None)

Using it again:

import numpy as np
import pandas as pd

feature_list = pd.read_csv("feature_list.csv")
feature_list = pd.Index(list(feature_list["0"]))

add_cols = list(feature_list.difference(X_test.columns))

drop_cols = list(X_test.columns.difference(feature_list))

for col in add_cols:
    X_test[col] = np.nan

for col in drop_cols:
    X_test = X_test.drop(col, axis = 1)

# reorder columns
X_test = X_test[feature_list]

types = pd.read_csv("data_types.csv")
for i in range(len(types)):
    X_test[types.iloc[i,0]] = X_test[types.iloc[i,0]].astype(types.iloc[i,1])
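As an alternative to the add, drop, and reorder loops, pandas' DataFrame.reindex can do the column alignment in one call. This is a sketch of a swapped-in technique, not the answer's own code; the column names and values below are hypothetical.

```python
import pandas as pd

# Hypothetical feature order recorded at training time
feature_list = pd.Index(["a", "b", "c"])

# Hypothetical test frame: column "b" is missing, "z" is extra,
# and the remaining columns are in a different order.
X_test = pd.DataFrame({"c": [1.0, 2.0], "z": [9, 9], "a": [3.0, 4.0]})

# reindex keeps only the columns in feature_list, in that order,
# and fills any column missing from X_test with NaN.
X_aligned = X_test.reindex(columns=feature_list)

print(X_aligned)
```

Columns that had to be added contain only NaN, so a later cast to an integer dtype would fail on them unless the missing values are filled first.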

Make Predictions

regressor = joblib.load("regressor.sav")
predictions = regressor.predict(X_test)

Save Prediction Results

res = pd.DataFrame(predictions)
res.index = X_test.index  # keep the original row labels so predictions can be matched to the input
res.columns = ["prediction"]
res.to_csv("prediction_results.csv")
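To see why keeping the index matters, here is a minimal sketch that writes predictions with their row labels and reads them back. The frame, the index labels, and the prediction values are hypothetical, and the CSV is kept in memory rather than written to prediction_results.csv.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical test features with non-default row labels
X_test = pd.DataFrame({"X": [1.0, 2.0, 3.0]}, index=[10, 20, 30])

# Hypothetical stand-in for regressor.predict(X_test)
predictions = np.array([266.8, 201.1, 446.0])

# Attach the test set's index so each prediction keeps its row label.
res = pd.Series(predictions, index=X_test.index, name="prediction")

# With no path argument, to_csv returns the CSV text.
csv_text = res.to_csv()

# index_col=0 restores the saved row labels as the index.
restored = pd.read_csv(io.StringIO(csv_text), index_col=0)

# The labels line up, so predictions can be joined back onto the features.
compared = X_test.join(restored)
print(compared)
```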

Enjoy end to end model/prediction saver code!
