Original question: http://stackoverflow.com/questions/15721053/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
What's the error of numpy.polyfit?
Asked by varantir
I want to use numpy.polyfit for physical calculations, so I need the magnitude of the error.
Accepted answer by Jaime
If you specify full=True in your call to polyfit, it will return extra information:
>>> import numpy as np
>>> x = np.arange(100)
>>> y = x**2 + 3*x + 5 + np.random.rand(100)
>>> np.polyfit(x, y, 2)
array([ 0.99995888, 3.00221219, 5.56776641])
>>> np.polyfit(x, y, 2, full=True)
(array([ 0.99995888, 3.00221219, 5.56776641]), # coefficients
array([ 7.19260721]), # residuals
3, # rank
array([ 11.87708199, 3.5299267 , 0.52876389]), # singular values
2.2204460492503131e-14) # conditioning threshold
The residual value returned is the sum of the squares of the fit errors. I'm not sure whether this is what you are after:
>>> np.sum((np.polyval(np.polyfit(x, y, 2), x) - y)**2)
7.1926072073491056
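If what you need is a typical per-point error rather than the raw sum of squares, one common convention is to divide the residual by the number of degrees of freedom and take the square root. This is an assumption about what "magnitude of the error" means here, not something polyfit returns directly. A minimal sketch, with a seeded generator added only so the numbers are repeatable:

```python
import numpy as np

rng = np.random.default_rng(0)           # seeded only to make the sketch repeatable
x = np.arange(100)
y = x**2 + 3*x + 5 + rng.random(100)     # same model as above, uniform noise in [0, 1)

deg = 2
p, res, rank, sv, rcond = np.polyfit(x, y, deg, full=True)

n_params = deg + 1                       # a degree-2 polynomial has 3 coefficients
dof = len(x) - n_params                  # degrees of freedom left after the fit
rms_error = np.sqrt(res[0] / dof)        # typical scatter of one point around the fit

print("sum of squared residuals:", res[0])
print("RMS error per point:", rms_error)
```

For uniform noise of width 1 the per-point scatter should come out near 0.29, which is the standard deviation of that distribution.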
Since version 1.7 there is also a cov keyword that returns the covariance matrix of your coefficients, which you can use to calculate the uncertainty of the fit coefficients themselves.
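As a rough sketch of how the cov keyword might be used: the square roots of the diagonal of the covariance matrix give one-standard-deviation uncertainties for the coefficients. The names a2, a1, a0 and the seeded generator are illustrative choices, not part of the original answer.

```python
import numpy as np

rng = np.random.default_rng(0)           # seeded only to make the sketch repeatable
x = np.arange(100)
y = x**2 + 3*x + 5 + rng.random(100)

# With cov=True, polyfit returns the coefficients and their covariance matrix
p, cov = np.polyfit(x, y, 2, cov=True)

# Standard errors of the coefficients: square roots of the diagonal entries
p_err = np.sqrt(np.diag(cov))

# Coefficients come highest power first, so a2 multiplies x**2
for name, value, err in zip(("a2", "a1", "a0"), p, p_err):
    print(f"{name} = {value:.5f} +/- {err:.5f}")
```

Note that the fitted constant term lands near 5.5 rather than 5, because the uniform noise has a mean of 0.5 that the fit absorbs into the offset.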
Answer by askewchan
As you can see in the documentation:
Returns
-------
p : ndarray, shape (M,) or (M, K)
    Polynomial coefficients, highest power first.
    If `y` was 2-D, the coefficients for `k`-th data set are in ``p[:,k]``.

residuals, rank, singular_values, rcond : present only if `full` = True
    Residuals of the least-squares fit, the effective rank of the scaled
    Vandermonde coefficient matrix, its singular values, and the specified
    value of `rcond`. For more details, see `linalg.lstsq`.
This means that you can do a fit and get the residuals as:
import numpy as np
x = np.arange(10)
y = x**2 - 3*x + np.random.random(10)
deg = 2  # degree of the fitted polynomial
p, res, _, _, _ = np.polyfit(x, y, deg, full=True)
Then p holds your fit parameters and res holds the residuals, as described above. The _'s are there because you don't need the last three return values, so you can assign them to the variable _, which you won't use. This is a convention and is not required.
@Jaime's answer explains what the residual means. Another thing you can do is look at the individual squared deviations as a function of x (their sum is res). This is particularly helpful for spotting a trend that the fit did not capture. res can be large because of statistical noise, or possibly because of systematically poor fitting, for example:
import matplotlib.pyplot as plt
x = np.arange(100)
y = 1000*np.sqrt(x) + x**2 - 10*x + 500*np.random.random(100) - 250
p = np.polyfit(x, y, 2)  # insufficient degree to include sqrt
yfit = np.polyval(p, x)
plt.figure()
plt.plot(x, y, label='data')
plt.plot(x, yfit, label='fit')
plt.plot(x, yfit - y, label='var')  # deviation of the fit from the data
plt.legend()
plt.show()
In the resulting figure, note the bad fit near x = 0.
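The same effect can also be checked numerically, without a plot. The sketch below compares the RMS deviation of the first few points with that of the remaining points. The cutoff n_edge = 5 is an arbitrary, hypothetical choice made for illustration, and the generator is seeded only for repeatability.

```python
import numpy as np

rng = np.random.default_rng(0)           # seeded only to make the sketch repeatable
x = np.arange(100)
y = 1000*np.sqrt(x) + x**2 - 10*x + 500*rng.random(100) - 250

p = np.polyfit(x, y, 2)                  # degree 2 cannot reproduce the sqrt term
resid = np.polyval(p, x) - y             # same sign convention as the plot: fit minus data

n_edge = 5                               # hypothetical cutoff for "near x = 0"
rms_edge = np.sqrt(np.mean(resid[:n_edge]**2))
rms_rest = np.sqrt(np.mean(resid[n_edge:]**2))

print("RMS deviation for x <", n_edge, ":", rms_edge)
print("RMS deviation elsewhere:", rms_rest)
```

A much larger value for the first group than for the rest points to a systematic misfit near x = 0 rather than to statistical noise, which would be spread evenly over all points.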

