Python: fitting exponential decay with no initial guess

Notice: this page is a Chinese-English side-by-side translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/3938042/

Time: 2020-08-18 13:25:11  Source: igfitidea

fitting exponential decay with no initial guessing

python numpy scipy

Asked by George Karpenkov

Does anyone know of a scipy/numpy module that can fit an exponential decay to data?

A Google search returned a few blog posts, for example http://exnumerus.blogspot.com/2010/04/how-to-fit-exponential-decay-example-in.html, but that solution requires the y-offset to be specified in advance, which is not always possible.

EDIT:

curve_fit works, but it can fail quite miserably with no initial guess for parameters, and that is sometimes needed. The code I'm working with is

#!/usr/bin/env python
import numpy as np
import scipy as sp
import pylab as pl
from scipy.optimize import curve_fit

x = np.array([  50.,  110.,  170.,  230.,  290.,  350.,  410.,  470.,
              530.,  590.])
y = np.array([ 3173.,  2391.,  1726.,  1388.,  1057.,   786.,   598.,
               443.,   339.,   263.])

smoothx = np.linspace(x[0], x[-1], 20)

guess_a, guess_b, guess_c = 4000, -0.005, 100
guess = [guess_a, guess_b, guess_c]

exp_decay = lambda x, A, t, y0: A * np.exp(x * t) + y0

params, cov = curve_fit(exp_decay, x, y, p0=guess)

A, t, y0 = params

print("A = %s\nt = %s\ny0 = %s\n" % (A, t, y0))

pl.clf()
best_fit = lambda x: A * np.exp(t * x) + y0

pl.plot(x, y, 'b.')
pl.plot(smoothx, best_fit(smoothx), 'r-')
pl.show()

which works, but if we remove "p0=guess", it fails miserably.

Accepted answer by Joe Kington

You have two options:

  1. Linearize the system, and fit a line to the log of the data.
  2. Use a non-linear solver (e.g. scipy.optimize.curve_fit).

The first option is by far the fastest and most robust. However, it requires that you know the y-offset a priori; otherwise it's impossible to linearize the equation. (That is, y = A * exp(K * t) can be linearized by fitting log(y) = log(A * exp(K * t)) = K * t + log(A), but y = A*exp(K*t) + C can only be linearized by fitting log(y - C) = K*t + log(A), and because y is your measured data, C must be known beforehand for this to be a linear system.)

If you use a non-linear method, it a) is not guaranteed to converge and yield a solution, b) will be much slower, c) gives a much poorer estimate of the uncertainty in your parameters, and d) is often much less precise. However, a non-linear method has one huge advantage over a linear inversion: it can solve a non-linear system of equations. In your case, this means that you don't have to know C beforehand.

Just to give an example, let's solve for y = A * exp(K * t) with some noisy data using both linear and nonlinear methods:

import numpy as np
import matplotlib.pyplot as plt
import scipy as sp
import scipy.optimize


def main():
    # Actual parameters
    A0, K0, C0 = 2.5, -4.0, 2.0

    # Generate some data based on these
    tmin, tmax = 0, 0.5
    num = 20
    t = np.linspace(tmin, tmax, num)
    y = model_func(t, A0, K0, C0)

    # Add noise
    noisy_y = y + 0.5 * (np.random.random(num) - 0.5)

    fig = plt.figure()
    ax1 = fig.add_subplot(2,1,1)
    ax2 = fig.add_subplot(2,1,2)

    # Non-linear Fit
    A, K, C = fit_exp_nonlinear(t, noisy_y)
    fit_y = model_func(t, A, K, C)
    plot(ax1, t, y, noisy_y, fit_y, (A0, K0, C0), (A, K, C))
    ax1.set_title('Non-linear Fit')

    # Linear Fit (note that we have to provide the y-offset ("C") value!)
    A, K = fit_exp_linear(t, noisy_y, C0)
    fit_y = model_func(t, A, K, C0)
    plot(ax2, t, y, noisy_y, fit_y, (A0, K0, C0), (A, K, C0))
    ax2.set_title('Linear Fit')

    plt.show()

def model_func(t, A, K, C):
    return A * np.exp(K * t) + C

def fit_exp_linear(t, y, C=0):
    y = y - C
    y = np.log(y)
    K, A_log = np.polyfit(t, y, 1)
    A = np.exp(A_log)
    return A, K

def fit_exp_nonlinear(t, y):
    opt_parms, parm_cov = sp.optimize.curve_fit(model_func, t, y, maxfev=1000)
    A, K, C = opt_parms
    return A, K, C

def plot(ax, t, y, noisy_y, fit_y, orig_parms, fit_parms):
    A0, K0, C0 = orig_parms
    A, K, C = fit_parms

    ax.plot(t, y, 'k--', 
      label='Actual Function:\n $y = %0.2f e^{%0.2f t} + %0.2f$' % (A0, K0, C0))
    ax.plot(t, fit_y, 'b-',
      label='Fitted Function:\n $y = %0.2f e^{%0.2f t} + %0.2f$' % (A, K, C))
    ax.plot(t, noisy_y, 'ro')
    ax.legend(bbox_to_anchor=(1.05, 1.1), fancybox=True, shadow=True)

if __name__ == '__main__':
    main()

Fitting exp

Note that the linear solution provides a result much closer to the actual values. However, we have to provide the y-offset value in order to use a linear solution. The non-linear solution doesn't require this a-priori knowledge.

Answer by Justin Peel

I would use the scipy.optimize.curve_fit function. Its docstring even has an example of fitting an exponential decay, which I'll copy here:

>>> import numpy as np
>>> from scipy.optimize import curve_fit
>>> def func(x, a, b, c):
...     return a*np.exp(-b*x) + c

>>> x = np.linspace(0,4,50)
>>> y = func(x, 2.5, 1.3, 0.5)
>>> yn = y + 0.2*np.random.normal(size=len(x))

>>> popt, pcov = curve_fit(func, x, yn)

The fitted parameters will vary because of the random noise added in, but I got 2.47990495, 1.40709306, 0.53753635 as a, b, and c so that's not so bad with the noise in there. If I fit to y instead of yn I get the exact a, b, and c values.
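
As a small extension of the docstring example, curve_fit also returns the covariance matrix pcov, and the square roots of its diagonal give one-standard-deviation error estimates for a, b, and c. The sketch below assumes a seeded NumPy generator (the seed and the name perr are illustrative choices, not part of the docstring), so the exact numbers will differ from the ones quoted above.

```python
import numpy as np
from scipy.optimize import curve_fit


def func(x, a, b, c):
    return a * np.exp(-b * x) + c


rng = np.random.default_rng(0)   # fixed seed: an assumption, for repeatability
x = np.linspace(0, 4, 50)
y = func(x, 2.5, 1.3, 0.5)
yn = y + 0.2 * rng.normal(size=len(x))

# No p0 is given, so curve_fit starts from its default initial values.
popt, pcov = curve_fit(func, x, yn)

# Standard errors of the fitted parameters from the covariance diagonal.
perr = np.sqrt(np.diag(pcov))

for name, value, err in zip("abc", popt, perr):
    print(f"{name} = {value:.3f} +/- {err:.3f}")
```

If the reported errors are large compared with the parameters themselves, that is a hint the fit is poorly constrained and a better starting point may be needed.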

Answer by Marcus P S

The right way to do it is to do Prony estimation and use the result as the initial guess for least squares fitting (or some other more robust fitting routine). Prony estimation does not need an initial guess, but it does need many points to yield a good estimate.

Here is an overview

http://www.statsci.org/other/prony.html

In Octave this is implemented as expfit, so you can write your own routine based on the Octave library function.

Prony estimation does need the offset to be known, but if you go "far enough" into your decay, you have a reasonable estimate of the offset, so you can just shift the data to place the offset at 0. At any rate, Prony estimation is just a way to get a reasonable initial guess for other fitting routines.
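
A rough sketch of that workflow, under stated assumptions: the data are synthetic and evenly spaced, the decay is sampled far enough that the last ten points sit at the offset, and only a single exponential term is estimated, which is the simplest case of Prony's method rather than a port of Octave's expfit. The names model, C_guess, K_guess, and A_guess are illustrative.

```python
import numpy as np
from scipy.optimize import curve_fit


def model(t, A, K, C):
    return A * np.exp(K * t) + C


# Synthetic, evenly spaced data that runs far into the decay (assumed values).
rng = np.random.default_rng(1)
t = np.linspace(0.0, 10.0, 101)
y = model(t, 2.5, -1.0, 0.5) + 0.01 * rng.normal(size=t.size)

# 1. Offset estimate: average of the tail, where the exponential has died out.
C_guess = y[-10:].mean()
s = y - C_guess                      # shifted data, offset now near 0

# 2. One-term Prony step: s[n+1] ~= z * s[n] with z = exp(K * dt),
#    so z follows from a linear least-squares fit with no initial guess.
dt = t[1] - t[0]
z = np.dot(s[:-1], s[1:]) / np.dot(s[:-1], s[:-1])
K_guess = np.log(z) / dt

# 3. Amplitude by linear least squares against the now-known exponential.
basis = np.exp(K_guess * t)
A_guess = np.dot(s, basis) / np.dot(basis, basis)

# 4. Refine all three parameters with the non-linear solver.
popt, pcov = curve_fit(model, t, y, p0=[A_guess, K_guess, C_guess])
print("initial guess:", A_guess, K_guess, C_guess)
print("refined fit:  ", popt)
```

The tail average only works when the record is long compared with the decay time; on a short record the offset estimate, and therefore everything derived from it, will be biased.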

Answer by Elendurwen

I never got curve_fit to work properly; like you, I don't want to guess anything. I was trying to simplify Joe Kington's example, and this is what I got working. The idea is to transform the 'noisy' data into log space, use polyfit and polyval to figure out the parameters, and then transform the result back:

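import numpy as np
from matplotlib import pyplot

# fit a straight line to log(y); model = [slope, intercept]
model = np.polyfit(xVals, np.log(yVals), 1)
# evaluate the line at every x and undo the log
splineYs = np.exp(np.polyval(model, xVals))
pyplot.plot(xVals, yVals, ',')      # show scatter plot of original data
pyplot.plot(xVals, splineYs, 'b-')  # show fitted line
pyplot.show()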

where xVals and yVals are just lists.

Answer by Louis Strous

I don't know python, but I do know a simple way to non-iteratively estimate the coefficients of exponential decay with an offset, given three data points with a fixed difference in their independent coordinate. Your data points have a fixed difference in their independent coordinate (your x values are spaced at an interval of 60), so my method can be applied to them. You can surely translate the math into python.

Assume

y = A + B*exp(-c*x) = A + B*C^x

where C = exp(-c)

Given y_0, y_1, y_2, for x = 0, 1, 2, we solve

y_0 = A + B
y_1 = A + B*C
y_2 = A + B*C^2

to find A, B, C as follows:

A = (y_0*y_2 - y_1^2)/(y_0 + y_2 - 2*y_1)
B = (y_1 - y_0)^2/(y_0 + y_2 - 2*y_1)
C = (y_2 - y_1)/(y_1 - y_0)

The corresponding exponential passes exactly through the three points (0,y_0), (1,y_1), and (2,y_2). If your data points are not at x coordinates 0, 1, 2 but rather at k, k + s, and k + 2*s, then

y = A′ + B′*C′^(k + s*x) = A′ + B′*C′^k*(C′^s)^x = A + B*C^x

so you can use the above formulas to find A, B, C and then calculate

A′ = A
C′ = C^(1/s)
B′ = B/(C′^k)

The resulting coefficients are very sensitive to errors in the y coordinates, which can lead to large errors if you extrapolate beyond the range defined by the three used data points, so it is best to calculate A, B, C from three data points that are as far apart as possible (while still having a fixed distance between them).

Your data set has 10 equidistant data points. Let's pick the three data points (110, 2391), (350, 786), (590, 263) for use; these have the greatest possible fixed distance (240) in the independent coordinate. So, y_0 = 2391, y_1 = 786, y_2 = 263, k = 110, s = 240. Then A = 10.20055, B = 2380.799, C = 0.3258567, A′ = 10.20055, B′ = 3980.329, C′ = 0.9953388. The exponential is

y = 10.20055 + 3980.329*0.9953388^x = 10.20055 + 3980.329*exp(-0.004672073*x)

You can use this exponential as the initial guess in a non-linear fitting algorithm.
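
The formulas above translate into Python fairly directly. This is a minimal sketch, assuming the three points are equally spaced and monotonic (so that the denominator is non-zero and C is positive); the function name three_point_exp is made up for illustration. With the three points chosen above it should reproduce the coefficients quoted in this answer to the printed precision.

```python
import math


def three_point_exp(x0, step, y0, y1, y2):
    """Exponential y = A + B * C**x through the points
    (x0, y0), (x0 + step, y1) and (x0 + 2*step, y2)."""
    denom = y0 + y2 - 2.0 * y1
    # Coefficients for the rescaled abscissae 0, 1, 2.
    A = (y0 * y2 - y1 ** 2) / denom
    B = (y1 - y0) ** 2 / denom
    C = (y2 - y1) / (y1 - y0)
    # Undo the rescaling: C' = C^(1/s), B' = B / C'^k, A' = A.
    C_prime = C ** (1.0 / step)
    B_prime = B / C_prime ** x0
    return A, B_prime, C_prime


# The three points picked from the question's data set (k = 110, s = 240).
A, B, C = three_point_exp(110.0, 240.0, 2391.0, 786.0, 263.0)
rate = -math.log(C)   # so that y = A + B * exp(-rate * x)
print(A, B, C, rate)
```

In the question's parameterisation A * exp(x * t) + y0, the initial guess would then be (B, -rate, A).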

The formula for calculating A is the same as that used by the Shanks transformation (http://en.wikipedia.org/wiki/Shanks_transformation).

Answer by JJacquelin

There is a procedure to fit an exponential with no initial guess and no iterative process (given as an image in the original answer).

This comes from the paper (pp.16-17) : https://fr.scribd.com/doc/14674814/Regressions-et-equations-integrales

If necessary, this can be used to initialise a non-linear regression calculation in order to choose a specific optimisation criterion.

EXAMPLE :

The example given by Joe Kington is interesting. Unfortunately the Data isn't shown, only the graph. So, the data (x,y) below comes from a graphical scan of the graph and as a consequence the numerical values are probably not exactly those used by Joe Kington. Nevertheless, the respective equations of the "fitted" curves are very close one to the other, considering the wide scatter of the points.

The upper figure is a copy of Kington's graph.

The lower Figure shows the results obtained with the procedure presented above.

Answer by Max

If your decay does not start at 0, use:

popt, pcov = curve_fit(self.func, x-x0, y)

where x0 is the start of the decay (where you want to start the fit). Then use x0 again for plotting:

plt.plot(x, self.func(x-x0, *popt),'--r', label='Fit')

where the function is:

    def func(self, x, a, tau, c):
        return a * np.exp(-x/tau) + c
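
Put together as a standalone sketch (a plain function instead of a method, with synthetic data and an assumed x0 = 5, all of which are illustrative choices):

```python
import numpy as np
from scipy.optimize import curve_fit


def func(x, a, tau, c):
    return a * np.exp(-x / tau) + c


# Synthetic decay that starts at x0 = 5 rather than 0 (assumed values).
rng = np.random.default_rng(2)
x0 = 5.0
x = np.linspace(x0, x0 + 4.0, 50)
y = func(x - x0, 2.5, 1.3, 0.5) + 0.01 * rng.normal(size=x.size)

# Fit against the shifted coordinate so that `a` is the amplitude at x0.
popt, pcov = curve_fit(func, x - x0, y)
print("a, tau, c =", popt)

# Evaluate the fitted curve on the original x axis by shifting again.
y_fit = func(x - x0, *popt)
```

Shifting keeps the amplitude parameter at the scale of the data; without the shift, a would have to absorb a factor of exp(x0/tau), which makes the fit much harder to start from default values.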

Answer by FriendToGeoff

Python implementation of @JJacquelin's solution. I needed an approximate non-solve based solution with no initial guesses so @JJacquelin's answer was really helpful. The original question was posed as a python numpy/scipy request. I took @johanvdw's nice clean R code and refactored it as python/numpy. Hopefully useful to someone: https://gist.github.com/friendtogeoff/00b89fa8d9acc1b2bdf3bdb675178a29

import numpy as np

"""
compute an exponential decay fit to two vectors of x and y data
result is in form y = a + b * exp(c*x).
ref. https://gist.github.com/johanvdw/443a820a7f4ffa7e9f8997481d7ca8b3
"""
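def exp_est(x, y):
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = np.size(x)
    # sort the data into ascending x order
    order = np.argsort(x)
    x, y = x[order], y[order]

    # Sk: cumulative trapezoidal integral of y over x (Sk[0] = 0).
    # The loop index is k, not n, so the point count n is not overwritten.
    Sk = np.zeros(n)
    for k in range(1, n):
        Sk[k] = Sk[k-1] + (y[k] + y[k-1])*(x[k]-x[k-1])/2
    dx = x - x[0]
    dy = y - y[0]

    # first linear least-squares system: solve for the exponent c
    m1 = np.array([[np.sum(dx**2), np.sum(dx*Sk)],
                   [np.sum(dx*Sk), np.sum(Sk**2)]])
    m2 = np.array([np.sum(dx*dy), np.sum(dy*Sk)])
    d, c = np.linalg.solve(m1, m2)

    # second linear least-squares system: solve for a and b given c
    e = np.exp(c*x)
    m3 = np.array([[n,         np.sum(e)],
                   [np.sum(e), np.sum(e**2)]])
    m4 = np.array([np.sum(y), np.sum(y*e)])
    a, b = np.linalg.solve(m3, m4)

    return [a, b, c]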