在python中求解非线性方程
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/19542801/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
Solving non-linear equations in python
提问by user1171835
I have 4 non-linear equations with three unknowns X
, Y
, and Z
that I want to solve for. The equations are of the form:
我有 4 个非线性方程,其中包含三个未知数X
, Y
, 和Z
我想求解。方程的形式为:
F(m) = X^2 + a(m)Y^2 + b(m)XYcosZ + c(m)XYsinZ
...where a
, b
and c
are constants which are dependent on each value of F
in the four equations.
... where a
, b
andc
是依赖于F
四个方程中每个值的常数。
What is the best way to go about solving this?
解决这个问题的最佳方法是什么?
采纳答案by Joe Kington
There are two ways to do this.
有两种方法可以做到这一点。
- Use a non-linear solver
- Linearize the problem and solve it in the least-squares sense
- 使用非线性求解器
- 将问题线性化并以最小二乘法求解
Setup
设置
So, as I understand your question, you know F, a, b, and c at 4 different points, and you want to invert for the model parameters X, Y, and Z. We have 3 unknowns and 4 observed data points, so the problem is overdetermined. Therefore, we'll be solving in the least-squares sense.
所以,正如我理解你的问题,你知道 F、a、b 和 c 在 4 个不同的点,并且你想反转模型参数 X、Y 和 Z。我们有 3 个未知数和 4 个观察到的数据点,所以问题被过度确定了。因此,我们将在最小二乘意义上进行求解。
It's more common to use the opposite terminology in this case, so let's flip your equation around. Instead of:
在这种情况下使用相反的术语更为常见,所以让我们颠倒一下你的等式。代替:
F_i = X^2 + a_i Y^2 + b_i X Y cosZ + c_i X Y sinZ
Let's write:
让我们写:
F_i = a^2 + X_i b^2 + Y_i a b cos(c) + Z_i a b sin(c)
Where we know F
, X
, Y
, and Z
at 4 different points (e.g. F_0, F_1, ... F_i
).
我们在哪里知道F
、X
、Y
和Z
4 个不同的点(例如F_0, F_1, ... F_i
)。
We're just changing the names of the variables, not the equation itself. (This is more for my ease of thinking than anything else.)
我们只是改变了变量的名称,而不是方程本身。(这比其他任何事情都更便于我思考。)
Linear Solution
线性解决方案
It's actually possible to linearize this equation. You can easily solve for a^2
, b^2
, a b cos(c)
, and a b sin(c)
. To make this a bit easier, let's relabel things yet again:
实际上可以将这个方程线性化。您可以轻松地解决a^2
,b^2
,a b cos(c)
,和a b sin(c)
。为了让这更容易一些,让我们再次重新标记:
d = a^2
e = b^2
f = a b cos(c)
g = a b sin(c)
Now the equation is a lot simpler: F_i = d + e X_i + f Y_i + g Z_i
. It's easy to do a least-squares linear inversion for d
, e
, f
, and g
. We can then get a
, b
, and c
from:
现在等式要简单得多:F_i = d + e X_i + f Y_i + g Z_i
. 这很容易做到最小二乘线性反演d
,e
,f
,和g
。然后,我们可以得到a
,b
以及c
来自:
a = sqrt(d)
b = sqrt(e)
c = arctan(g/f)
Okay, let's write this up in matrix form. We're going to translate 4 observations of (the code we'll write will take any number of observations, but let's keep it concrete at the moment):
好的,让我们把它写成矩阵形式。我们将翻译 4 个观察结果(我们将编写的代码将接受任意数量的观察结果,但现在让我们保持具体):
F_i = d + e X_i + f Y_i + g Z_i
Into:
进入:
|F_0| |1, X_0, Y_0, Z_0| |d|
|F_1| = |1, X_1, Y_1, Z_1| * |e|
|F_2| |1, X_2, Y_2, Z_2| |f|
|F_3| |1, X_3, Y_3, Z_3| |g|
Or: F = G * m
(I'm a geophysist, so we use G
for "Green's Functions" and m
for "Model Parameters". Usually we'd use d
for "data" instead of F
, as well.)
或者:(F = G * m
我是地球物理学家,所以我们使用G
“格林函数”和m
“模型参数”。通常我们也会使用d
“数据”而不是F
。)
In python, this would translate to:
在python中,这将转化为:
def invert(f, x, y, z):
G = np.vstack([np.ones_like(x), x, y, z]).T
m, _, _, _ = np.linalg.lstsq(G, f)
d, e, f, g = m
a = np.sqrt(d)
b = np.sqrt(e)
c = np.arctan2(g, f) # Note that `c` will be in radians, not degrees
return a, b, c
Non-linear Solution
非线性解决方案
You could also solve this using scipy.optimize
, as @Joe suggested. The most accessible function in scipy.optimize
is scipy.optimize.curve_fit
which uses a Levenberg-Marquardt method by default.
scipy.optimize
正如@Joe 建议的那样,您也可以使用 解决此问题。最容易访问的函数scipy.optimize
是scipy.optimize.curve_fit
默认使用 Levenberg-Marquardt 方法。
Levenberg-Marquardt is a "hill climbing" algorithm (well, it goes downhill, in this case, but the term is used anyway). In a sense, you make an initial guess of the model parameters (all ones, by default in scipy.optimize
) and follow the slope of observed - predicted
in your parameter space downhill to the bottom.
Levenberg-Marquardt 是一种“爬山”算法(好吧,在这种情况下它会走下坡路,但无论如何都会使用该术语)。从某种意义上说,你做的模型参数(所有的,默认情况下,初始的猜测scipy.optimize
),并遵循的斜率observed - predicted
在参数空间下坡底部。
Caveat:Picking the right non-linear inversion method, initial guess, and tuning the parameters of the method is very much a "dark art". You only learn it by doing it, and there are a lot of situations where things won't work properly. Levenberg-Marquardt is a good general method if your parameter space is fairly smooth (this one should be). There are a lot of others (including genetic algorithms, neural nets, etc in addition to more common methods like simulated annealing) that are better in other situations. I'm not going to delve into that part here.
警告:选择正确的非线性反演方法、初始猜测和调整方法的参数在很大程度上是一门“黑暗艺术”。你只能通过实践来学习它,并且有很多情况无法正常工作。如果您的参数空间相当平滑(这应该是),Levenberg-Marquardt 是一种很好的通用方法。还有很多其他方法(包括遗传算法、神经网络等,以及模拟退火等更常见的方法)在其他情况下更好。我不打算在这里深入研究那部分。
There is one common gotcha that some optimization toolkits try to correct for that scipy.optimize
doesn't try to handle. If your model parameters have different magnitudes (e.g. a=1, b=1000, c=1e-8
), you'll need to rescale things so that they're similar in magnitude. Otherwise scipy.optimize
's "hill climbing" algorithms (like LM) won't accurately calculate the estimate the local gradient, and will give wildly inaccurate results. For now, I'm assuming that a
, b
, and c
have relatively similar magnitudes. Also, be aware that essentially all non-linear methods require you to make an initial guess, and are sensitive to that guess. I'm leaving it out below (just pass it in as the p0
kwarg to curve_fit
) because the default a, b, c = 1, 1, 1
is a fairly accurate guess for a, b, c = 3, 2, 1
.
有一个常见的问题,一些优化工具包试图纠正它scipy.optimize
并没有尝试处理。如果您的模型参数具有不同的幅度(例如a=1, b=1000, c=1e-8
),您将需要重新调整事物的比例,使它们的幅度相似。否则scipy.optimize
的“爬山”算法(如 LM)将无法准确计算局部梯度的估计值,并且会给出非常不准确的结果。现在,我假设a
,b
以及c
具有相对类似的量值。另外,请注意,基本上所有非线性方法都要求您进行初始猜测,并且对该猜测很敏感。我在下面p0
省略了它(只需将它作为kwarg 传递给curve_fit
),因为默认值a, b, c = 1, 1, 1
是对a, b, c = 3, 2, 1
.
With the caveats out of the way, curve_fit
expects to be passed a function, a set of points where the observations were made (as a single ndim x npoints
array), and the observed values.
排除警告后,curve_fit
期望传递一个函数、一组观察点(作为单个ndim x npoints
数组)和观察值。
So, if we write the function like this:
所以,如果我们这样写函数:
def func(x, y, z, a, b, c):
f = (a**2
+ x * b**2
+ y * a * b * np.cos(c)
+ z * a * b * np.sin(c))
return f
We'll need to wrap it to accept slightly different arguments before passing it to curve_fit
.
在将它传递给curve_fit
.
In a nutshell:
简而言之:
def nonlinear_invert(f, x, y, z):
def wrapped_func(observation_points, a, b, c):
x, y, z = observation_points
return func(x, y, z, a, b, c)
xdata = np.vstack([x, y, z])
model, cov = opt.curve_fit(wrapped_func, xdata, f)
return model
Stand-alone Example of the two methods:
两种方法的独立示例:
To give you a full implementation, here's an example that
为了给你一个完整的实现,这里有一个例子
- generates randomly distributed points to evaluate the function on,
- evaluates the function on those points (using set model parameters),
- adds noise to the results,
- and then inverts for the model parameters using both the linear and non-linear methods described above.
- 生成随机分布的点来评估函数,
- 在这些点上评估函数(使用设置的模型参数),
- 给结果增加噪音,
- 然后使用上述线性和非线性方法对模型参数进行反演。
import numpy as np
import scipy.optimize as opt
def main():
nobservations = 4
a, b, c = 3.0, 2.0, 1.0
f, x, y, z = generate_data(nobservations, a, b, c)
print 'Linear results (should be {}, {}, {}):'.format(a, b, c)
print linear_invert(f, x, y, z)
print 'Non-linear results (should be {}, {}, {}):'.format(a, b, c)
print nonlinear_invert(f, x, y, z)
def generate_data(nobservations, a, b, c, noise_level=0.01):
x, y, z = np.random.random((3, nobservations))
noise = noise_level * np.random.normal(0, noise_level, nobservations)
f = func(x, y, z, a, b, c) + noise
return f, x, y, z
def func(x, y, z, a, b, c):
f = (a**2
+ x * b**2
+ y * a * b * np.cos(c)
+ z * a * b * np.sin(c))
return f
def linear_invert(f, x, y, z):
G = np.vstack([np.ones_like(x), x, y, z]).T
m, _, _, _ = np.linalg.lstsq(G, f)
d, e, f, g = m
a = np.sqrt(d)
b = np.sqrt(e)
c = np.arctan2(g, f) # Note that `c` will be in radians, not degrees
return a, b, c
def nonlinear_invert(f, x, y, z):
# "curve_fit" expects the function to take a slightly different form...
def wrapped_func(observation_points, a, b, c):
x, y, z = observation_points
return func(x, y, z, a, b, c)
xdata = np.vstack([x, y, z])
model, cov = opt.curve_fit(wrapped_func, xdata, f)
return model
main()
回答by Joe
You probably want to be using scipy's nonlinear solvers, they're really easy: http://docs.scipy.org/doc/scipy/reference/optimize.nonlin.html
您可能想使用 scipy 的非线性求解器,它们非常简单:http://docs.scipy.org/doc/scipy/reference/optimize.nonlin.html