Python scikit-learn: How to calculate root mean square error (RMSE) as a percentage?

Question

Asked by Desta Haileselassie Hagos

I have a dataset (found in this link: https://drive.google.com/open?id=0B2Iv8dfU4fTUY2ltNGVkMG05V00) of the following format.

 time     X   Y
0.000543  0  10
0.000575  0  10
0.041324  1  10
0.041331  2  10
0.041336  3  10
0.04134   4  10
  ...
9.987735  55 239
9.987739  56 239
9.987744  57 239
9.987749  58 239
9.987938  59 239

The third column (Y) in my dataset is my true value; that is what I want to predict (estimate). I want to predict the current value of Y from the previous 100 rolling values of X. For this, I have the following Python script, which uses a random forest regression model.

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""

@author: deshag
"""

import pandas as pd
import numpy as np
from io import StringIO
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from math import sqrt



df = pd.read_csv('estimated_pred.csv')

for i in range(1,100):
    df['X_t'+str(i)] = df['X'].shift(i)

print(df)

df.dropna(inplace=True)


X=pd.DataFrame({ 'X_%d'%i : df['X'].shift(i) for i in range(100)}).apply(np.nan_to_num, axis=0).values


y = df['Y'].values


reg = RandomForestRegressor(criterion='mse')
reg.fit(X,y)
modelPred = reg.predict(X)
print(modelPred)

print("Number of predictions:",len(modelPred))

meanSquaredError=mean_squared_error(y, modelPred)
print("MSE:", meanSquaredError)
rootMeanSquaredError = sqrt(meanSquaredError)
print("RMSE:", rootMeanSquaredError)

At the end, I measured the root-mean-square error (RMSE) and got an RMSE of 19.57. From what I have read in the documentation, the RMSE is expressed in the same units as the response. Is there any way to present the value of an RMSE as a percentage? For example, to say that this percentage of the prediction is correct and this much is wrong.

There is a check_array function that I tried to use for calculating the mean absolute percentage error (MAPE) in the recent version of sklearn, but it doesn't seem to work the same way as in the previous version when I try it as follows.

import numpy as np
from sklearn.utils import check_array

def calculate_mape(y_true, y_pred):
    y_true, y_pred = check_array(y_true, y_pred)

    return np.mean(np.abs((y_true - y_pred) / y_true)) * 100

calculate_mape(y, modelPred)

This returns an error: ValueError: not enough values to unpack (expected 2, got 1). It seems that the check_array function in the recent version returns only a single value, unlike the previous version.

Is there any way to present the RMSE as a percentage or calculate MAPE using sklearn for Python?

Answer 1

Answered by Imran

Your implementation of calculate_mape is not working because you are expecting the check_arrays function, which was removed in sklearn 0.16. check_array is not what you want.
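To illustrate the difference: check_array validates one array per call and returns that single array, so its result cannot be unpacked into two names. The following is a minimal sketch, assuming a recent scikit-learn; the sample values are made up for illustration and are not taken from the dataset in the question.

```python
import numpy as np
from sklearn.utils import check_array

# Made-up example values (assumption, not the asker's data)
y_true = np.array([10.0, 10.0, 239.0])
y_pred = np.array([12.0, 9.0, 230.0])

# check_array takes ONE array and returns ONE validated array.
# ensure_2d=False lets it accept 1-D targets like these.
y_true_checked = check_array(y_true, ensure_2d=False)
y_pred_checked = check_array(y_pred, ensure_2d=False)

print(y_true_checked.shape, y_pred_checked.shape)
```

Validating each input in its own call is one way to keep the input checks without relying on the removed check_arrays helper.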

This StackOverflow answer gives a working implementation.
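For reference, here is a hedged sketch of what such an implementation might look like using plain NumPy; it is not the code from the linked answer. It also shows one possible way to express RMSE as a percentage, by dividing it by the mean of the true values. That normalization is an assumption chosen for illustration (other conventions divide by the range of the true values), and the helper name rmse_percent_of_mean is hypothetical, not a scikit-learn function. The sample numbers are made up.

```python
import numpy as np


def calculate_mape(y_true, y_pred):
    """Mean absolute percentage error, in percent.

    Undefined when y_true contains zeros, so that case raises an error.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    if y_true.shape != y_pred.shape:
        raise ValueError("y_true and y_pred must have the same shape")
    if np.any(y_true == 0):
        raise ValueError("MAPE is undefined when y_true contains zeros")
    return np.mean(np.abs((y_true - y_pred) / y_true)) * 100


def rmse_percent_of_mean(y_true, y_pred):
    """RMSE divided by the mean of y_true, in percent (hypothetical helper)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))
    return rmse / np.mean(y_true) * 100


# Made-up example values (assumption, not the asker's data)
y_example = [10.0, 20.0, 40.0]
pred_example = [11.0, 18.0, 44.0]

mape = calculate_mape(y_example, pred_example)
rmse_pct = rmse_percent_of_mean(y_example, pred_example)
print("MAPE (%):", mape)
print("RMSE as % of mean(y):", rmse_pct)
```

Note that neither number means "this percentage of predictions is correct"; both describe the typical size of the error relative to the true values. Newer scikit-learn releases (0.24 and later) also provide sklearn.metrics.mean_absolute_percentage_error, which returns a fraction rather than a percentage.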
