Mean Squared Error in Python

Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. If you use or share it, you must do so under the same license and attribute it to the original authors (not me). Original: http://stackoverflow.com/questions/39064684/

python, numpy, scikit-learn

Asked by Keithx

I'm trying to write a function that will calculate the mean squared error from y (the true values) and y_pred (the predicted ones) without using sklearn or other implementations.

Here is my attempt:

def mserror(y, y_pred):
    i=0
    for i in range (len(y)):
        i+=1
        mse = ((y - y_pred) ** 2).mean(y)   
        return mse

Can you please point out what I'm doing wrong in the calculation and how it can be fixed?

Answered by percusse

You are modifying the index for no reason; a for loop increments it anyway. Also, you are not actually using the index, for example in an expression like y[i] - y_pred[i], hence you don't need the loop at all.

Use the arrays directly:

mse = np.mean((y - y_pred) ** 2)  # assumes import numpy as np, with y and y_pred as arrays
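
A quick sanity check of the one-liner (the sample arrays here are purely illustrative):

import numpy as np

y = np.array([1.0, 2.0, 3.0])
y_pred = np.array([1.5, 2.0, 2.0])

mse = np.mean((y - y_pred) ** 2)  # (0.25 + 0.0 + 1.0) / 3 = 0.41666...
print(mse)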

Answered by rotem

I would say:

def get_mse(y, y_pred):
    d1 = y - y_pred             # element-wise residuals
    N = len(y)                  # number of samples
    mse = (1 / N) * d1.dot(d1)  # the dot product of d1 with itself sums the squared residuals
    return mse

This only works if y and y_pred are NumPy arrays, but as long as you decide not to use other libraries you will want them to be NumPy arrays anyway, so that you can do math operations on them.

NumPy's dot() function computes the dot product of two NumPy arrays (you can also write np.dot(d1, d1)).

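A minimal usage sketch, assuming y and y_pred are NumPy arrays (the data is illustrative), showing that it agrees with the np.mean version above:

import numpy as np

y = np.array([1.0, 2.0, 3.0])
y_pred = np.array([1.5, 2.0, 2.0])

print(get_mse(y, y_pred))          # 0.41666...
print(np.mean((y - y_pred) ** 2))  # same value: the dot product sums the squared residuals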

Answered by Ramzan Shahid

Here's how to implement MSE in Python:

def mse_metric(actual, predicted):
    sum_error = 0.0
    # loop over all values
    for i in range(len(actual)):
        # accumulate the squared difference (actual - predicted)^2
        prediction_error = actual[i] - predicted[i]
        sum_error += prediction_error ** 2
    # now normalize by the number of values
    mean_error = sum_error / float(len(actual))
    return mean_error
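
For instance (illustrative data), mse_metric([1.0, 2.0, 3.0], [1.5, 2.0, 2.0]) returns 0.41666..., matching the vectorized versions above; since it only indexes and subtracts, it works on plain Python lists as well as NumPy arrays.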

Answered by Ramzan Shahid

Firstly, you are using i and incrementing it yourself, but range automatically advances it to the next number on each iteration, so don't modify i again. The other issue is that you are taking the mean of y; instead, take the mean of ((y - y_pred) ** 2). I hope you got the point.

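Applying both fixes to the question's function gives a sketch like this (drop the manual index bookkeeping, and take the mean of the squared differences rather than passing y to mean()):

def mserror(y, y_pred):
    # y and y_pred are assumed to be NumPy arrays of equal length
    return ((y - y_pred) ** 2).mean()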