How to do exponential and logarithmic curve fitting in Python? I found only polynomial fitting

Question

Asked by Tomas Novotny

I have a set of data and I want to compare which line describes it best (polynomials of different orders, exponential or logarithmic).

I use Python and Numpy and for polynomial fitting there is a function polyfit(). But I found no such functions for exponential and logarithmic fitting.

Are there any? Or how to solve it otherwise?

Answer 1

Accepted answer by kennytm

For fitting y = A + B log x, just fit y against (log x).

>>> x = numpy.array([1, 7, 20, 50, 79])
>>> y = numpy.array([10, 19, 30, 35, 51])
>>> numpy.polyfit(numpy.log(x), y, 1)
array([ 8.46295607,  6.61867463])
# y ≈ 8.46 log(x) + 6.62
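
Since the question is about comparing candidate models, it can help to attach a goodness-of-fit number to this fit. The following is a minimal sketch that reuses the data above and computes R² on the original y values; the variable names (`y_hat`, `r_squared`) are illustrative and not part of the original answer.

```python
import numpy as np

x = np.array([1, 7, 20, 50, 79], dtype=float)
y = np.array([10, 19, 30, 35, 51], dtype=float)

# Fit y = A + B*log(x) by regressing y on log(x); polyfit returns [B, A].
B, A = np.polyfit(np.log(x), y, 1)

# Goodness of fit, measured on the original y values.
y_hat = A + B * np.log(x)
ss_res = np.sum((y - y_hat) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r_squared = 1.0 - ss_res / ss_tot

print(f"A = {A:.3f}, B = {B:.3f}, R^2 = {r_squared:.3f}")
```

The same R² calculation can be applied to the predictions of any other candidate model, as long as it is always evaluated on the untransformed y.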

For fitting y = Ae^(Bx), take the logarithm of both sides, which gives log y = log A + Bx. So fit (log y) against x.

Note that fitting (log y) as if it is linear will emphasize small values of y, causing large deviation for large y. This is because polyfit (linear regression) works by minimizing ∑_i (ΔY_i)² = ∑_i (Y_i − Ŷ_i)². When Y_i = log y_i, the residuals ΔY_i = Δ(log y_i) ≈ Δy_i / |y_i|. So even if polyfit makes a very bad decision for large y, the "divide-by-|y|" factor will compensate for it, causing polyfit to favor small values.

This could be alleviated by giving each entry a "weight" proportional to y. polyfit supports weighted least squares via the w keyword argument. Because w multiplies the unsquared residuals, passing w=numpy.sqrt(y) weights each squared residual by y.

>>> x = numpy.array([10, 19, 30, 35, 51])
>>> y = numpy.array([1, 7, 20, 50, 79])
>>> numpy.polyfit(x, numpy.log(y), 1)
array([ 0.10502711, -0.40116352])
#    y ≈ exp(-0.401) * exp(0.105 * x) = 0.670 * exp(0.105 * x)
# (^ biased towards small values)
>>> numpy.polyfit(x, numpy.log(y), 1, w=numpy.sqrt(y))
array([ 0.06009446,  1.41648096])
#    y ≈ exp(1.42) * exp(0.0601 * x) = 4.12 * exp(0.0601 * x)
# (^ not so biased)
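
To make the bias concrete, the sketch below measures both fits by their sum of squared errors in y (not in log y). It reuses the data above; the helper name `sse_in_y` is introduced here for illustration only.

```python
import numpy as np

x = np.array([10, 19, 30, 35, 51], dtype=float)
y = np.array([1, 7, 20, 50, 79], dtype=float)

def sse_in_y(coeffs):
    """Sum of squared errors in y for a fit of log(y) = slope*x + intercept."""
    slope, intercept = coeffs
    y_hat = np.exp(intercept) * np.exp(slope * x)
    return np.sum((y - y_hat) ** 2)

unweighted = np.polyfit(x, np.log(y), 1)
weighted = np.polyfit(x, np.log(y), 1, w=np.sqrt(y))

sse_unweighted = sse_in_y(unweighted)
sse_weighted = sse_in_y(weighted)
print(sse_unweighted, sse_weighted)
```

On this data the weighted fit has the smaller error in y, mostly because the unweighted fit overshoots the largest point badly.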

Note that Excel, LibreOffice and most scientific calculators typically use the unweighted (biased) formula for the exponential regression / trend lines. If you want your results to be compatible with these platforms, do not include the weights even if they provide better results.

Now, if you can use scipy, you could use scipy.optimize.curve_fit to fit any model without transformations.

For y = A + B log x the result is the same as the transformation method:

>>> x = numpy.array([1, 7, 20, 50, 79])
>>> y = numpy.array([10, 19, 30, 35, 51])
>>> scipy.optimize.curve_fit(lambda t,a,b: a+b*numpy.log(t),  x,  y)
(array([ 6.61867467,  8.46295606]), 
 array([[ 28.15948002,  -7.89609542],
        [ -7.89609542,   2.9857172 ]]))
# y ≈ 6.62 + 8.46 log(x)

For y = Ae^(Bx), however, we can get a better fit since it minimizes Δy directly instead of Δ(log y). But we need to provide an initial guess so curve_fit can reach the desired local minimum.

>>> x = numpy.array([10, 19, 30, 35, 51])
>>> y = numpy.array([1, 7, 20, 50, 79])
>>> scipy.optimize.curve_fit(lambda t,a,b: a*numpy.exp(b*t),  x,  y)
(array([  5.60728326e-21,   9.99993501e-01]),
 array([[  4.14809412e-27,  -1.45078961e-08],
        [ -1.45078961e-08,   5.07411462e+10]]))
# oops, definitely wrong.
>>> scipy.optimize.curve_fit(lambda t,a,b: a*numpy.exp(b*t),  x,  y,  p0=(4, 0.1))
(array([ 4.88003249,  0.05531256]),
 array([[  1.01261314e+01,  -4.31940132e-02],
        [ -4.31940132e-02,   1.91188656e-04]]))
# y ≈ 4.88 exp(0.0553 x). much better.
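
One way to obtain an initial guess without picking numbers by hand is to seed curve_fit with the result of the log-linear polyfit. This is a sketch under that assumption, not something the answer itself prescribes; `model`, `sse`, and the other names are illustrative.

```python
import numpy as np
from scipy.optimize import curve_fit

x = np.array([10, 19, 30, 35, 51], dtype=float)
y = np.array([1, 7, 20, 50, 79], dtype=float)

def model(t, a, b):
    return a * np.exp(b * t)

def sse(params):
    """Sum of squared errors in y for the given (a, b)."""
    return np.sum((y - model(x, *params)) ** 2)

# Step 1: rough estimate from the linearized fit log(y) = log(a) + b*x.
slope, intercept = np.polyfit(x, np.log(y), 1)
p0 = (np.exp(intercept), slope)

# Step 2: refine with nonlinear least squares on the untransformed data.
popt, pcov = curve_fit(model, x, y, p0=p0)

sse_initial = sse(p0)
sse_final = sse(popt)
print(p0, popt)
```

The linearized estimate is biased, as discussed above, but it is usually close enough to keep the optimizer away from the degenerate solution seen with the default starting point.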

Answer 2

Answered by IanVS

You can also fit a set of data to whatever function you like using curve_fit from scipy.optimize. For example, if you want to fit an exponential function (from the documentation):

import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit

def func(x, a, b, c):
    return a * np.exp(-b * x) + c

x = np.linspace(0,4,50)
y = func(x, 2.5, 1.3, 0.5)
yn = y + 0.2*np.random.normal(size=len(x))

popt, pcov = curve_fit(func, x, yn)

And then if you want to plot, you could do:

plt.figure()
plt.plot(x, yn, 'ko', label="Original Noised Data")
plt.plot(x, func(x, *popt), 'r-', label="Fitted Curve")
plt.legend()
plt.show()

(Note: the * in front of popt when you plot will expand out the terms into the a, b, and c that func is expecting.)
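
As a small illustration of the unpacking, and of what the second return value can be used for, the sketch below repeats the fit with a seeded random generator (an assumption made here so the run is repeatable) and derives one-standard-deviation parameter uncertainties from the diagonal of pcov.

```python
import numpy as np
from scipy.optimize import curve_fit

def func(x, a, b, c):
    return a * np.exp(-b * x) + c

rng = np.random.default_rng(0)  # seeded so the run is repeatable
x = np.linspace(0, 4, 50)
yn = func(x, 2.5, 1.3, 0.5) + 0.2 * rng.normal(size=x.size)

popt, pcov = curve_fit(func, x, yn)

# func(x, *popt) is shorthand for passing each fitted parameter in order.
same = np.allclose(func(x, *popt), func(x, popt[0], popt[1], popt[2]))

# One-standard-deviation uncertainties from the covariance diagonal.
perr = np.sqrt(np.diag(pcov))
for name, value, err in zip("abc", popt, perr):
    print(f"{name} = {value:.3f} +/- {err:.3f}")
```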

Answer 3

Answered by Leandro

I was having some trouble with this so let me be very explicit so noobs like me can understand.

Let's say that we have a data file or something like that:

# -*- coding: utf-8 -*-

import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
import numpy as np
import sympy as sym

"""
Generate some data, let's imagine that you already have this. 
"""
x = np.linspace(0, 3, 50)
y = np.exp(x)

"""
Plot your data
"""
plt.plot(x, y, 'ro',label="Original Data")

"""
brute force to avoid errors
"""    
x = np.array(x, dtype=float) #transform your data in a numpy array of floats 
y = np.array(y, dtype=float) #so the curve_fit can work

"""
create a function to fit with your data. a, b, c and d are the coefficients
that curve_fit will calculate for you. 
In this part you need to guess and/or use mathematical knowledge to find
a function that resembles your data
"""
def func(x, a, b, c, d):
    return a*x**3 + b*x**2 +c*x + d

"""
make the curve_fit
"""
popt, pcov = curve_fit(func, x, y)

"""
The result is:
popt[0] = a , popt[1] = b, popt[2] = c and popt[3] = d of the function,
so f(x) = popt[0]*x**3 + popt[1]*x**2 + popt[2]*x + popt[3].
"""
print("a = %s , b = %s, c = %s, d = %s" % (popt[0], popt[1], popt[2], popt[3]))

"""
Use sympy to generate the LaTeX syntax of the function
"""
xs = sym.Symbol(r'\lambda')  # raw string so the backslash is kept literally
tex = sym.latex(func(xs,*popt)).replace('$', '')
plt.title(r'$f(\lambda)= %s$' %(tex),fontsize=16)

"""
Print the coefficients and plot the function.
"""

plt.plot(x, func(x, *popt), label="Fitted Curve") #same as the commented-out line below
#plt.plot(x, popt[0]*x**3 + popt[1]*x**2 + popt[2]*x + popt[3], label="Fitted Curve") 

plt.legend(loc='upper left')
plt.show()

The result is: a = 0.849195983017, b = -1.18101681765, c = 2.24061176543, d = 0.816643894816

[Figure: raw data and fitted function]
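
Because the model here is a plain cubic, the same coefficients can also be obtained from np.polyfit, which ties this answer back to the question. The following sketch checks that the two routes agree on the same generated data; the name `cubic` is illustrative.

```python
import numpy as np
from scipy.optimize import curve_fit

x = np.linspace(0, 3, 50)
y = np.exp(x)

def cubic(x, a, b, c, d):
    return a * x**3 + b * x**2 + c * x + d

popt, _ = curve_fit(cubic, x, y)
coeffs = np.polyfit(x, y, 3)  # highest power first: [a, b, c, d]

print(popt)
print(coeffs)
```

curve_fit becomes necessary only when the model is not linear in its parameters, as with the exponential forms in the other answers.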

Answer 4

Answered by murphy1310

Well I guess you can always use:

np.log   -->  natural log
np.log10 -->  base 10
np.log2  -->  base 2

Slightly modifying IanVS's answer:

import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit

def func(x, a, b, c):
  #return a * np.exp(-b * x) + c
  return a * np.log(b * x) + c

x = np.linspace(1,5,50)   # start at 1 instead of 0 to avoid taking the log of 0
y = func(x, 2.5, 1.3, 0.5)
yn = y + 0.2*np.random.normal(size=len(x))

popt, pcov = curve_fit(func, x, yn)

plt.figure()
plt.plot(x, yn, 'ko', label="Original Noised Data")
plt.plot(x, func(x, *popt), 'r-', label="Fitted Curve")
plt.legend()
plt.show()

This results in a plot of the noisy data ("Original Noised Data") with the fitted curve overlaid.
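
A point the answer does not make, offered here as a side note: a * log(b * x) + c equals a * log(x) + (a * log(b) + c), so b and c cannot be determined separately from data, and only the combined offset is identifiable. The sketch below uses noise-free data and the illustrative names `a_est` and `d_est` to show the two-parameter form recovering that offset.

```python
import numpy as np

a, b, c = 2.5, 1.3, 0.5
x = np.linspace(1, 5, 50)
y = a * np.log(b * x) + c          # noise-free, to make the identity visible

# Two-parameter form: y = a*log(x) + d, with d = a*log(b) + c.
a_est, d_est = np.polyfit(np.log(x), y, 1)

print(a_est, d_est, a * np.log(b) + c)
```

In practice this means the three-parameter fit may return different b and c on different runs while describing the same curve.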

Answer 5

Answered by pylang

Here's a linearization option on simple data that uses tools from scikit-learn.

Given

import numpy as np

import matplotlib.pyplot as plt

from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import FunctionTransformer


np.random.seed(123)

# General Functions
def func_exp(x, a, b, c):
    """Return values from a general exponential function."""
    return a * np.exp(b * x) + c


def func_log(x, a, b, c):
    """Return values from a general log function."""
    return a * np.log(b * x) + c


# Helper
def generate_data(func, *args, jitter=0):
    """Return a tuple of arrays with random data along a general function."""
    xs = np.linspace(1, 5, 50)
    ys = func(xs, *args)
    noise = jitter * np.random.normal(size=len(xs)) + jitter
    xs = xs.reshape(-1, 1)                                  # xs[:, np.newaxis]
    ys = (ys + noise).reshape(-1, 1)
    return xs, ys

transformer = FunctionTransformer(np.log, validate=True)

Code

Fit exponential data

# Data
x_samp, y_samp = generate_data(func_exp, 2.5, 1.2, 0.7, jitter=3)
y_trans = transformer.fit_transform(y_samp)             # 1

# Regression
regressor = LinearRegression()
results = regressor.fit(x_samp, y_trans)                # 2
model = results.predict
y_fit = model(x_samp)

# Visualization
plt.scatter(x_samp, y_samp)
plt.plot(x_samp, np.exp(y_fit), "k--", label="Fit")     # 3
plt.title("Exponential Fit")

Fit log data

# Data
x_samp, y_samp = generate_data(func_log, 2.5, 1.2, 0.7, jitter=0.15)
x_trans = transformer.fit_transform(x_samp)             # 1

# Regression
regressor = LinearRegression()
results = regressor.fit(x_trans, y_samp)                # 2
model = results.predict
y_fit = model(x_trans)

# Visualization
plt.scatter(x_samp, y_samp)
plt.plot(x_samp, y_fit, "k--", label="Fit")             # 3
plt.title("Logarithmic Fit")

Details

General Steps

1. Apply a log operation to data values (x, y or both)
2. Regress the data to a linearized model
3. Plot by "reversing" any log operations (with np.exp()) and fit to original data

Assuming our data follows an exponential trend, a general equation⁺ may be:

y = A * exp(B * x)

We can linearize the latter equation (e.g. y = intercept + slope * x) by taking the log:

log(y) = log(A) + B * x

Given a linearized equation⁺⁺ and the regression parameters, we could calculate:

A via intercept (ln(A))
B via slope (B)
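
The recovery of A and B from the intercept and slope can be sketched as follows. To keep the example dependency-free it uses np.polyfit in place of scikit-learn's LinearRegression; with a fitted LinearRegression the same two numbers are available as `regressor.intercept_` and `regressor.coef_`. The data here assume C = 0 and small multiplicative noise, which is an assumption of this sketch rather than the setup used above.

```python
import numpy as np

rng = np.random.default_rng(123)
A_true, B_true = 2.5, 1.2
x = np.linspace(1, 5, 50)

# Exponential data with C = 0 and small multiplicative noise (assumed here).
y = A_true * np.exp(B_true * x) * np.exp(rng.normal(scale=0.05, size=x.size))

# Linearized model: log(y) = log(A) + B * x
slope, intercept = np.polyfit(x, np.log(y), 1)

A_est = np.exp(intercept)  # A via intercept
B_est = slope              # B via slope
print(A_est, B_est)
```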

Summary of Linearization Techniques

Relationship |  Example   |     General Eqn.     |  Altered Var.  |        Linearized Eqn.  
-------------|------------|----------------------|----------------|------------------------------------------
Linear       | x          | y =     B * x    + C | -              |        y =   C    + B * x
Logarithmic  | log(x)     | y = A * log(B*x) + C | log(x)         |        y =   C    + A * (log(B) + log(x))
Exponential  | 2**x, e**x | y = A * exp(B*x) + C | log(y)         | log(y-C) = log(A) + B * x
Power        | x**2       | y =     B * x**N + C | log(x), log(y) | log(y-C) = log(B) + N * log(x)

⁺ Note: linearizing exponential functions works best when the noise is small and C = 0. Use with caution.

⁺⁺ Note: altering y data (taking log(y)) helps linearize exponential data, while altering x data (taking log(x)) helps linearize log data.
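
The power-law row of the table is the only one not demonstrated in the code above. A minimal sketch, assuming C = 0 and noise-free data, with both variables log-transformed (np.polyfit is used here for brevity instead of scikit-learn):

```python
import numpy as np

B_true, N_true = 3.0, 2.0
x = np.linspace(1, 5, 50)
y = B_true * x ** N_true           # noise-free power law with C = 0

# log(y) = log(B) + N * log(x): the slope is N, the intercept is log(B).
N_est, log_B = np.polyfit(np.log(x), np.log(y), 1)
B_est = np.exp(log_B)

print(B_est, N_est)
```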

Answer 6

Answered by pylang

We demonstrate features of lmfit while solving both problems.

Given

import lmfit

import numpy as np

import matplotlib.pyplot as plt


# IPython/Jupyter magic; remove the next line in a plain Python script
%matplotlib inline
np.random.seed(123)

# General Functions
def func_log(x, a, b, c):
    """Return values from a general log function."""
    return a * np.log(b * x) + c


# Data
x_samp = np.linspace(1, 5, 50)
_noise = np.random.normal(size=len(x_samp), scale=0.06)
y_samp = 2.5 * np.exp(1.2 * x_samp) + 0.7 + _noise
y_samp2 = 2.5 * np.log(1.2 * x_samp) + 0.7 + _noise

Code

Approach 1 - lmfit Model

Fit exponential data

regressor = lmfit.models.ExponentialModel()                # 1    
initial_guess = dict(amplitude=1, decay=-1)                # 2
results = regressor.fit(y_samp, x=x_samp, **initial_guess)
y_fit = results.best_fit    

plt.plot(x_samp, y_samp, "o", label="Data")
plt.plot(x_samp, y_fit, "k--", label="Fit")
plt.legend()

Approach 2 - Custom Model

Fit log data

regressor = lmfit.Model(func_log)                          # 1
initial_guess = dict(a=1, b=.1, c=.1)                      # 2
results = regressor.fit(y_samp2, x=x_samp, **initial_guess)
y_fit = results.best_fit

plt.plot(x_samp, y_samp2, "o", label="Data")
plt.plot(x_samp, y_fit, "k--", label="Fit")
plt.legend()

Details

1. Choose a regression class
2. Supply named, initial guesses that respect the function's domain

You can determine the inferred parameters from the regressor object. Example:

regressor.param_names
# ['decay', 'amplitude']

Note: the ExponentialModel() follows a decay function, which accepts two parameters, one of which is negative.
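
To relate these parameters to the growth form y = A * exp(B * x) used elsewhere in this thread, the sketch below assumes the model evaluates amplitude * exp(-x / decay), as lmfit's documentation describes. The function `exponential` is a plain NumPy stand-in written for this illustration, not a call into lmfit.

```python
import numpy as np

def exponential(x, amplitude, decay):
    """Stand-in mirroring the documented form amplitude * exp(-x / decay)."""
    return amplitude * np.exp(-x / decay)

A, B = 2.5, 1.2                    # growth parameters of y = A * exp(B * x)
x = np.linspace(1, 5, 50)

amplitude = A
decay = -1.0 / B                   # a negative decay corresponds to growth

same = np.allclose(exponential(x, amplitude, decay), A * np.exp(B * x))
print(same)
```

Under that assumption, a fitted decay can be converted back with B = -1 / decay.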

See also ExponentialGaussianModel(), which accepts more parameters.

Install the library via > pip install lmfit.

Answer 7

回答by Ben

Wolfram has a closed-form solution for fitting an exponential. They also have similar solutions for fitting a logarithmic and power law.

I found this to work better than scipy's curve_fit. Here is an example:

import numpy as np
import matplotlib.pyplot as plt

# Fit the function y = A * exp(B * x) to the data
# returns (A, B)
# From: https://mathworld.wolfram.com/LeastSquaresFittingExponential.html
def fit_exp(xs, ys):
    S_x2_y = 0.0
    S_y_lny = 0.0
    S_x_y = 0.0
    S_x_y_lny = 0.0
    S_y = 0.0
    for (x,y) in zip(xs, ys):
        S_x2_y += x * x * y
        S_y_lny += y * np.log(y)
        S_x_y += x * y
        S_x_y_lny += x * y * np.log(y)
        S_y += y
    #end
    a = (S_x2_y * S_y_lny - S_x_y * S_x_y_lny) / (S_y * S_x2_y - S_x_y * S_x_y)
    b = (S_y * S_x_y_lny - S_x_y * S_y_lny) / (S_y * S_x2_y - S_x_y * S_x_y)
    return (np.exp(a), b)


xs = [33, 34, 35, 36, 37, 38, 39, 40, 41, 42]
ys = [3187, 3545, 4045, 4447, 4872, 5660, 5983, 6254, 6681, 7206]

(A, B) = fit_exp(xs, ys)

plt.figure()
plt.plot(xs, ys, 'o-', label='Raw Data')
plt.plot(xs, [A * np.exp(B *x) for x in xs], 'o-', label='Fit')

plt.title('Exponential Fit Test')
plt.xlabel('X')
plt.ylabel('Y')
plt.legend(loc='best')
plt.tight_layout()
plt.show()
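
The same closed form can be written with NumPy array operations instead of an explicit loop. The sketch below does that (the name `fit_exp_vectorized` is introduced here) and also checks an observation that the answer does not state: these sums are the normal equations of a least-squares fit of log(y) with each squared residual weighted by y, so the result should match np.polyfit with w=np.sqrt(y).

```python
import numpy as np

def fit_exp_vectorized(xs, ys):
    """Weighted closed-form fit of y = A * exp(B * x); returns (A, B)."""
    x = np.asarray(xs, dtype=float)
    y = np.asarray(ys, dtype=float)
    ln_y = np.log(y)
    s_y = y.sum()
    s_xy = (x * y).sum()
    s_x2y = (x * x * y).sum()
    s_y_lny = (y * ln_y).sum()
    s_xy_lny = (x * y * ln_y).sum()
    denom = s_y * s_x2y - s_xy ** 2
    a = (s_x2y * s_y_lny - s_xy * s_xy_lny) / denom
    b = (s_y * s_xy_lny - s_xy * s_y_lny) / denom
    return np.exp(a), b

xs = [33, 34, 35, 36, 37, 38, 39, 40, 41, 42]
ys = [3187, 3545, 4045, 4447, 4872, 5660, 5983, 6254, 6681, 7206]

A, B = fit_exp_vectorized(xs, ys)

# Cross-check against the weighted log-linear fit.
slope, intercept = np.polyfit(xs, np.log(ys), 1, w=np.sqrt(ys))
print(A, B, np.exp(intercept), slope)
```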
