在python中求解非线性方程

Question

提问by user1171835

I have 4 non-linear equations with three unknowns X, Y, and Zthat I want to solve for. The equations are of the form:

我有 4 个非线性方程，其中包含三个未知数X, Y, 和Z我想求解。方程的形式为：

F(m) = X^2 + a(m)Y^2 + b(m)XYcosZ + c(m)XYsinZ

...where a, band care constants which are dependent on each value of Fin the four equations.

... where a, bandc是依赖于F四个方程中每个值的常数。

What is the best way to go about solving this?

解决这个问题的最佳方法是什么？

Answer 1

采纳答案by Joe Kington

There are two ways to do this.

有两种方法可以做到这一点。

Use a non-linear solver
Linearize the problem and solve it in the least-squares sense

使用非线性求解器
将问题线性化并以最小二乘法求解

Setup

设置

So, as I understand your question, you know F, a, b, and c at 4 different points, and you want to invert for the model parameters X, Y, and Z. We have 3 unknowns and 4 observed data points, so the problem is overdetermined. Therefore, we'll be solving in the least-squares sense.

所以，正如我理解你的问题，你知道 F、a、b 和 c 在 4 个不同的点，并且你想反转模型参数 X、Y 和 Z。我们有 3 个未知数和 4 个观察到的数据点，所以问题被过度确定了。因此，我们将在最小二乘意义上进行求解。

It's more common to use the opposite terminology in this case, so let's flip your equation around. Instead of:

在这种情况下使用相反的术语更为常见，所以让我们颠倒一下你的等式。代替：

F_i = X^2 + a_i Y^2 + b_i X Y cosZ + c_i X Y sinZ

Let's write:

让我们写：

F_i = a^2 + X_i b^2 + Y_i a b cos(c) + Z_i a b sin(c)

Where we know F, X, Y, and Zat 4 different points (e.g. F_0, F_1, ... F_i).

我们在哪里知道F、X、Y和Z4 个不同的点（例如F_0, F_1, ... F_i）。

We're just changing the names of the variables, not the equation itself. (This is more for my ease of thinking than anything else.)

我们只是改变了变量的名称，而不是方程本身。（这比其他任何事情都更便于我思考。）

Linear Solution

线性解决方案

It's actually possible to linearize this equation. You can easily solve for a^2, b^2, a b cos(c), and a b sin(c). To make this a bit easier, let's relabel things yet again:

实际上可以将这个方程线性化。您可以轻松地解决a^2，b^2，a b cos(c)，和a b sin(c)。为了让这更容易一些，让我们再次重新标记：

d = a^2
e = b^2
f = a b cos(c)
g = a b sin(c)

Now the equation is a lot simpler: F_i = d + e X_i + f Y_i + g Z_i. It's easy to do a least-squares linear inversion for d, e, f, and g. We can then get a, b, and cfrom:

现在等式要简单得多：F_i = d + e X_i + f Y_i + g Z_i. 这很容易做到最小二乘线性反演d，e，f，和g。然后，我们可以得到a，b以及c来自：

a = sqrt(d)
b = sqrt(e)
c = arctan(g/f)

Okay, let's write this up in matrix form. We're going to translate 4 observations of (the code we'll write will take any number of observations, but let's keep it concrete at the moment):

好的，让我们把它写成矩阵形式。我们将翻译 4 个观察结果（我们将编写的代码将接受任意数量的观察结果，但现在让我们保持具体）：

F_i = d + e X_i + f Y_i + g Z_i

Into:

进入：

|F_0|   |1, X_0, Y_0, Z_0|   |d|
|F_1| = |1, X_1, Y_1, Z_1| * |e|
|F_2|   |1, X_2, Y_2, Z_2|   |f|
|F_3|   |1, X_3, Y_3, Z_3|   |g|

Or: F = G * m(I'm a geophysist, so we use Gfor "Green's Functions" and mfor "Model Parameters". Usually we'd use dfor "data" instead of F, as well.)

或者：（F = G * m我是地球物理学家，所以我们使用G“格林函数”和m“模型参数”。通常我们也会使用d“数据”而不是F。）

In python, this would translate to:

在python中，这将转化为：

def invert(f, x, y, z):
    G = np.vstack([np.ones_like(x), x, y, z]).T
    m, _, _, _ = np.linalg.lstsq(G, f)

    d, e, f, g = m
    a = np.sqrt(d)
    b = np.sqrt(e)
    c = np.arctan2(g, f) # Note that `c` will be in radians, not degrees
    return a, b, c

Non-linear Solution

非线性解决方案

You could also solve this using scipy.optimize, as @Joe suggested. The most accessible function in scipy.optimizeis scipy.optimize.curve_fitwhich uses a Levenberg-Marquardt method by default.

scipy.optimize正如@Joe 建议的那样，您也可以使用解决此问题。最容易访问的函数scipy.optimize是scipy.optimize.curve_fit默认使用 Levenberg-Marquardt 方法。

Levenberg-Marquardt is a "hill climbing" algorithm (well, it goes downhill, in this case, but the term is used anyway). In a sense, you make an initial guess of the model parameters (all ones, by default in scipy.optimize) and follow the slope of observed - predictedin your parameter space downhill to the bottom.

Levenberg-Marquardt 是一种“爬山”算法（好吧，在这种情况下它会走下坡路，但无论如何都会使用该术语）。从某种意义上说，你做的模型参数（所有的，默认情况下，初始的猜测scipy.optimize），并遵循的斜率observed - predicted在参数空间下坡底部。

Caveat:Picking the right non-linear inversion method, initial guess, and tuning the parameters of the method is very much a "dark art". You only learn it by doing it, and there are a lot of situations where things won't work properly. Levenberg-Marquardt is a good general method if your parameter space is fairly smooth (this one should be). There are a lot of others (including genetic algorithms, neural nets, etc in addition to more common methods like simulated annealing) that are better in other situations. I'm not going to delve into that part here.

警告：选择正确的非线性反演方法、初始猜测和调整方法的参数在很大程度上是一门“黑暗艺术”。你只能通过实践来学习它，并且有很多情况无法正常工作。如果您的参数空间相当平滑（这应该是），Levenberg-Marquardt 是一种很好的通用方法。还有很多其他方法（包括遗传算法、神经网络等，以及模拟退火等更常见的方法）在其他情况下更好。我不打算在这里深入研究那部分。

There is one common gotcha that some optimization toolkits try to correct for that scipy.optimizedoesn't try to handle. If your model parameters have different magnitudes (e.g. a=1, b=1000, c=1e-8), you'll need to rescale things so that they're similar in magnitude. Otherwise scipy.optimize's "hill climbing" algorithms (like LM) won't accurately calculate the estimate the local gradient, and will give wildly inaccurate results. For now, I'm assuming that a, b, and chave relatively similar magnitudes. Also, be aware that essentially all non-linear methods require you to make an initial guess, and are sensitive to that guess. I'm leaving it out below (just pass it in as the p0kwarg to curve_fit) because the default a, b, c = 1, 1, 1is a fairly accurate guess for a, b, c = 3, 2, 1.

有一个常见的问题，一些优化工具包试图纠正它scipy.optimize并没有尝试处理。如果您的模型参数具有不同的幅度（例如a=1, b=1000, c=1e-8），您将需要重新调整事物的比例，使它们的幅度相似。否则scipy.optimize的“爬山”算法（如 LM）将无法准确计算局部梯度的估计值，并且会给出非常不准确的结果。现在，我假设a，b以及c具有相对类似的量值。另外，请注意，基本上所有非线性方法都要求您进行初始猜测，并且对该猜测很敏感。我在下面p0省略了它（只需将它作为kwarg 传递给curve_fit），因为默认值a, b, c = 1, 1, 1是对a, b, c = 3, 2, 1.

With the caveats out of the way, curve_fitexpects to be passed a function, a set of points where the observations were made (as a single ndim x npointsarray), and the observed values.

排除警告后，curve_fit期望传递一个函数、一组观察点（作为单个ndim x npoints数组）和观察值。

So, if we write the function like this:

所以，如果我们这样写函数：

def func(x, y, z, a, b, c):
    f = (a**2
         + x * b**2
         + y * a * b * np.cos(c)
         + z * a * b * np.sin(c))
    return f

We'll need to wrap it to accept slightly different arguments before passing it to curve_fit.

在将它传递给curve_fit.

In a nutshell:

简而言之：

def nonlinear_invert(f, x, y, z):
    def wrapped_func(observation_points, a, b, c):
        x, y, z = observation_points
        return func(x, y, z, a, b, c)

    xdata = np.vstack([x, y, z])
    model, cov = opt.curve_fit(wrapped_func, xdata, f)
    return model

Stand-alone Example of the two methods:

两种方法的独立示例：

To give you a full implementation, here's an example that

为了给你一个完整的实现，这里有一个例子

generates randomly distributed points to evaluate the function on,
evaluates the function on those points (using set model parameters),
adds noise to the results,
and then inverts for the model parameters using both the linear and non-linear methods described above.

生成随机分布的点来评估函数，
在这些点上评估函数（使用设置的模型参数），
给结果增加噪音，
然后使用上述线性和非线性方法对模型参数进行反演。

import numpy as np
import scipy.optimize as opt

def main():
    nobservations = 4
    a, b, c = 3.0, 2.0, 1.0
    f, x, y, z = generate_data(nobservations, a, b, c)

    print 'Linear results (should be {}, {}, {}):'.format(a, b, c)
    print linear_invert(f, x, y, z)

    print 'Non-linear results (should be {}, {}, {}):'.format(a, b, c)
    print nonlinear_invert(f, x, y, z)

def generate_data(nobservations, a, b, c, noise_level=0.01):
    x, y, z = np.random.random((3, nobservations))
    noise = noise_level * np.random.normal(0, noise_level, nobservations)
    f = func(x, y, z, a, b, c) + noise
    return f, x, y, z

def func(x, y, z, a, b, c):
    f = (a**2
         + x * b**2
         + y * a * b * np.cos(c)
         + z * a * b * np.sin(c))
    return f

def linear_invert(f, x, y, z):
    G = np.vstack([np.ones_like(x), x, y, z]).T
    m, _, _, _ = np.linalg.lstsq(G, f)

    d, e, f, g = m
    a = np.sqrt(d)
    b = np.sqrt(e)
    c = np.arctan2(g, f) # Note that `c` will be in radians, not degrees
    return a, b, c

def nonlinear_invert(f, x, y, z):
    # "curve_fit" expects the function to take a slightly different form...
    def wrapped_func(observation_points, a, b, c):
        x, y, z = observation_points
        return func(x, y, z, a, b, c)

    xdata = np.vstack([x, y, z])
    model, cov = opt.curve_fit(wrapped_func, xdata, f)
    return model

main()

Answer 2

回答by Joe

You probably want to be using scipy's nonlinear solvers, they're really easy: http://docs.scipy.org/doc/scipy/reference/optimize.nonlin.html

您可能想使用 scipy 的非线性求解器，它们非常简单：http://docs.scipy.org/doc/scipy/reference/optimize.nonlin.html

在python中求解非线性方程

提问by user1171835

采纳答案by Joe Kington

Setup

设置

Linear Solution

线性解决方案

Non-linear Solution

非线性解决方案

Stand-alone Example of the two methods:

两种方法的独立示例：

回答by Joe

相关推荐

最近更新

标签

在python中求解非线性方程

提问by user1171835

采纳答案by Joe Kington

Setup

设置

Linear Solution

线性解决方案

Non-linear Solution

非线性解决方案

Stand-alone Example of the two methods:

两种方法的独立示例：

回答by Joe

相关推荐

从python列表中提取一系列数据

Python 带有进度条的 tkinter gui

Python 将 DNA 翻译成蛋白质

Python 熊猫适用，但仅适用于满足条件的行

相关推荐

最近更新

标签