
Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. If you use or share it, you must follow the same license and attribute it to the original authors (not me). Original: http://stackoverflow.com/questions/41725377/

Posted: 2020-08-20 01:31:26 · Source: igfitidea

Why isn't `curve_fit` able to estimate the covariance of the parameter if the parameter fits exactly?

python · numpy · scipy · curve-fitting · covariance

Asked by finefoot

I don't understand why `curve_fit` isn't able to estimate the covariance of the parameter, thus raising the `OptimizeWarning` below. The following MCVE explains my problem:


MCVE python snippet


from scipy.optimize import curve_fit
func = lambda x, a: a * x
popt, pcov = curve_fit(f=func, xdata=[1], ydata=[1])
print(popt, pcov)

Output


\python-3.4.4\lib\site-packages\scipy\optimize\minpack.py:715:
OptimizeWarning: Covariance of the parameters could not be estimated
category=OptimizeWarning)

[ 1.] [[ inf]]

For `a = 1` the function fits `xdata` and `ydata` exactly. Why isn't the error/variance 0, or something close to 0, but `inf` instead?


There is this quote from the `curve_fit` SciPy Reference Guide:


If the Jacobian matrix at the solution doesn't have a full rank, then 'lm' method returns a matrix filled with np.inf, on the other hand 'trf' and 'dogbox' methods use Moore-Penrose pseudoinverse to compute the covariance matrix.

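A quick way to see that difference in behaviour is a deliberately over-parameterised model (my own example, not from the question): in `f(x) = a * b * x` only the product `a * b` is identifiable, so the two Jacobian columns `b * x` and `a * x` are proportional and the Jacobian can never have full rank.

```python
import numpy as np
from scipy.optimize import curve_fit

# Over-parameterised model: the Jacobian columns b*x and a*x are
# proportional, so the Jacobian is rank-deficient for any data.
func = lambda x, a, b: a * b * x
x = np.arange(1.0, 6.0)
y = 2.0 * x

# 'trf' computes pcov via the Moore-Penrose pseudoinverse, so it stays
# finite even though a and b are not individually identifiable.
popt, pcov = curve_fit(func, x, y, method="trf")
print(np.isfinite(pcov).all())  # True
print(popt[0] * popt[1])        # ~2.0
```

Only the product of the fitted parameters is meaningful here; the individual entries of `pcov` should not be trusted.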

So, what's the underlying problem? Why doesn't the Jacobian matrix at the solution have a full rank?


Accepted answer by finefoot

The formula for the covariance of the parameters (Wikipedia) has the number of degrees of freedom in the denominator. The degrees of freedom are computed as (number of data points) - (number of parameters), which is 1 - 1 = 0 in your example. And this is where SciPy checks the number of degrees of freedom before dividing by it.

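The check can be sketched as follows. This is a simplified reconstruction of the estimate `curve_fit` computes for the default `absolute_sigma=False` case; `fit_covariance` is a hypothetical helper, not part of SciPy:

```python
import numpy as np

def fit_covariance(jac, residuals, n_params):
    # Sketch of the covariance estimate: pcov = inv(J^T J) * s_sq,
    # where s_sq is the residual variance cost / (M - N).
    m = residuals.size          # M: number of data points
    dof = m - n_params          # N: number of fitted parameters
    if dof <= 0:
        # No degrees of freedom left: the residual variance cannot
        # be estimated, so the covariance is reported as inf.
        return np.full((n_params, n_params), np.inf)
    s_sq = np.sum(residuals ** 2) / dof
    return np.linalg.inv(jac.T @ jac) * s_sq

# One data point, one parameter a in f(x) = a*x: J = [[1.]], dof = 0.
print(fit_covariance(np.array([[1.0]]), np.array([0.0]), 1))  # [[inf]]
```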

With `xdata = [1, 2]`, `ydata = [1, 2]` you would get zero covariance (note that the model still fits exactly: an exact fit is not the problem).

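That claim is easy to verify with the same model as in the question, just with a second data point:

```python
from scipy.optimize import curve_fit

func = lambda x, a: a * x

# Two data points, one parameter: dof = 2 - 1 = 1, so the covariance
# can be estimated. The fit is still exact, so the residuals are zero
# and the estimated covariance is zero as well.
popt, pcov = curve_fit(f=func, xdata=[1, 2], ydata=[1, 2])
print(popt, pcov)  # [1.] [[0.]]
```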

This is the same sort of issue as the sample variance being undefined when the sample size N is 1 (the formula for sample variance has (N - 1) in the denominator). If we only take a sample of size 1 from the population, we don't estimate the variance to be zero; we know nothing about the variance.

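The standard library draws the same line explicitly (my own illustration, not from the original answer):

```python
import statistics

# The sample variance of a single observation is undefined, not zero:
try:
    statistics.variance([1.0])
except statistics.StatisticsError as err:
    print(err)  # variance requires at least two data points

# With two data points it is defined (here: 0.5).
print(statistics.variance([1.0, 2.0]))
```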