Python Scipy: Lognormal Fitting
Disclaimer: the content below is a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. If you use or share it, you must do so under the same license and attribute it to the original authors (not me) on StackOverflow.
Original question: http://stackoverflow.com/questions/18534562/
Scipy: lognormal fitting
Asked by bioslime
There have been quite a few posts on handling the lognorm distribution with Scipy, but I still don't get the hang of it.
The two-parameter lognormal is usually described by the parameters μ and σ, which correspond to SciPy's loc=0, with σ=shape and μ=np.log(scale).
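As a quick sanity check of that mapping, here is a small sketch (the values μ=10 and σ=3 are just examples) that draws samples and recovers μ and σ from the log of the sample:

import numpy as np
from scipy import stats

mu, sigma = 10.0, 3.0
samples = stats.lognorm.rvs(sigma, loc=0, scale=np.exp(mu), size=100000)
print(np.mean(np.log(samples)))  # close to mu (10)
print(np.std(np.log(samples)))   # close to sigma (3)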
At scipy, lognormal distribution - parameters we can read how to generate a lognorm(μ, σ) sample by taking the exponential of a normally distributed random variable. Now let's try something else:
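That exponentiation approach looks roughly like this (a sketch with example values, not code from the linked post):

import numpy as np
from scipy import stats

mu, sigma = 10.0, 3.0
via_exp = np.exp(np.random.normal(mu, sigma, size=10000))                  # exp of a normal sample
via_scipy = stats.lognorm.rvs(sigma, loc=0, scale=np.exp(mu), size=10000)  # direct lognorm sample
print(np.median(via_exp), np.median(via_scipy))  # both medians should be near np.exp(mu)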
A)
What's the problem with creating a lognorm directly:
import numpy as np
import scipy as sp
import scipy.stats  # needed so that sp.stats is available

# lognorm(mu=10, sigma=3)
# so shape=3, loc=0, scale=np.exp(10) ?
x = np.linspace(0.01, 20, 200)
sample_dist = sp.stats.lognorm.pdf(x, 3, loc=0, scale=np.exp(10))
shape, loc, scale = sp.stats.lognorm.fit(sample_dist, floc=0)
print(shape, loc, scale)
print(np.log(scale), shape)  # mu and sigma
# last line: -7.63285693379 0.140259699945  # not 10 and 3
B)
I use the return values of a fit to create a fitted distribution. But again I'm apparently doing something wrong:
import numpy as np
import scipy as sp
import scipy.stats
import matplotlib.pyplot as plt

samp = sp.stats.lognorm(0.5, loc=0, scale=1).rvs(size=2000)  # sample
param = sp.stats.lognorm.fit(samp)  # fit the sample data
print(param)  # does not coincide with shape, loc, scale above!
x = np.linspace(0, 4, 100)
pdf_fitted = sp.stats.lognorm.pdf(x, param[0], loc=param[1], scale=param[2])  # fitted distribution
pdf = sp.stats.lognorm.pdf(x, 0.5, loc=0, scale=1)  # original distribution
plt.plot(x, pdf_fitted, 'r-', x, pdf, 'g-')
plt.hist(samp, bins=30, density=True, alpha=.3)  # density=True (normed= was removed in newer matplotlib)
Answered by bioslime
I realized my mistakes:
A) The samples I am drawing need to come from the .rvs method, like so:
sample_dist = sp.stats.lognorm.rvs(3, loc=0, scale=np.exp(10), size=2000)
B) The fit has some problems. When we fix the loc parameter, the fit succeeds much better:
param = sp.stats.lognorm.fit(samp, floc=0)
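Putting both fixes together, a minimal sketch (reusing the μ=10, σ=3 values from part A) would look like this:

import numpy as np
from scipy import stats

samp = stats.lognorm.rvs(3, loc=0, scale=np.exp(10), size=2000)  # real samples via .rvs
shape, loc, scale = stats.lognorm.fit(samp, floc=0)              # fit with loc fixed at 0
print(np.log(scale), shape)  # should come out close to 10 and 3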
Answered by Christian K.
I made the same observation: a free fit of all parameters fails most of the time. You can help by providing a better initial guess; fixing the parameter is not necessary.
from scipy import stats

samp = stats.lognorm(0.5, loc=0, scale=1).rvs(size=2000)

# this is where the fit gets its initial guess from
print(stats.lognorm._fitstart(samp))
# (1.0, 0.66628696413404565, 0.28031095750445462)

print(stats.lognorm.fit(samp))
# note that the fit failed completely as the parameters did not change at all
# (1.0, 0.66628696413404565, 0.28031095750445462)

# fit again with a better initial guess for loc
print(stats.lognorm.fit(samp, loc=0))
# (0.50146296628099118, 0.0011019321419653122, 0.99361128537912125)
You can also make up your own function to calculate the initial guess, e.g.:
def your_func(sample):
    # do some magic here and return a (shape, loc, scale) starting guess
    return guess

stats.lognorm._fitstart = your_func
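For example, one possible concrete guess function (a sketch of my own; the name log_moment_guess is not from the answer) could estimate the starting values from the log of a strictly positive sample:

import numpy as np
from scipy import stats

samp = stats.lognorm(0.5, loc=0, scale=1).rvs(size=2000)

def log_moment_guess(sample):
    logs = np.log(sample)
    return np.std(logs), 0.0, np.exp(np.mean(logs))  # (shape, loc, scale), loc pinned at 0

stats.lognorm._fitstart = log_moment_guess
print(stats.lognorm.fit(samp))  # the fit should now start from (and stay near) sensible values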
Answered by Luis DG
This problem has been fixed in newer scipy versions. After upgrading from scipy 0.9 to scipy 0.14 the problem disappears.
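A quick way to check which version is installed (a trivial sketch):

import scipy
print(scipy.__version__)  # the free fit reportedly behaves better from 0.14 onwards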
Answered by nenetto
I answered this here.
I leave the code here too, just for the lazy :D
import scipy.stats  # importing the submodule explicitly makes scipy.stats available
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
mu = 10 # Mean of sample !!! Make sure your data is positive for the lognormal example
sigma = 1.5 # Standard deviation of sample
N = 2000 # Number of samples
norm_dist = scipy.stats.norm(loc=mu, scale=sigma) # Create Random Process
x = norm_dist.rvs(size=N) # Generate samples
# Fit normal
fitting_params = scipy.stats.norm.fit(x)
norm_dist_fitted = scipy.stats.norm(*fitting_params)
t = np.linspace(np.min(x), np.max(x), 100)
# Plot normals
f, ax = plt.subplots(1, sharex='col', figsize=(10, 5))
sns.distplot(x, ax=ax, norm_hist=True, kde=False, label='Data X~N(mu={0:.1f}, sigma={1:.1f})'.format(mu, sigma))
ax.plot(t, norm_dist_fitted.pdf(t), lw=2, color='r',
        label='Fitted Model X~N(mu={0:.1f}, sigma={1:.1f})'.format(norm_dist_fitted.mean(), norm_dist_fitted.std()))
ax.plot(t, norm_dist.pdf(t), lw=2, color='g', ls=':',
        label='Original Model X~N(mu={0:.1f}, sigma={1:.1f})'.format(norm_dist.mean(), norm_dist.std()))
ax.legend(loc='lower right')
plt.show()
# The lognormal model fits a variable whose log is normal
# We create such a variable by exponentiating the previous (normal) variable
x_exp = np.exp(x)
mu_exp = np.exp(mu)
sigma_exp = np.exp(sigma)
fitting_params_lognormal = scipy.stats.lognorm.fit(x_exp, floc=0, scale=mu_exp)
lognorm_dist_fitted = scipy.stats.lognorm(*fitting_params_lognormal)
t = np.linspace(np.min(x_exp), np.max(x_exp), 100)
# Here is the magic I was looking for a long long time
lognorm_dist = scipy.stats.lognorm(s=sigma, loc=0, scale=np.exp(mu))
# Plot lognormals
f, ax = plt.subplots(1, sharex='col', figsize=(10, 5))
sns.distplot(x_exp, ax=ax, norm_hist=True, kde=False,
             label='Data exp(X)~N(mu={0:.1f}, sigma={1:.1f})\n X~LogNorm(mu={0:.1f}, sigma={1:.1f})'.format(mu, sigma))
ax.plot(t, lognorm_dist_fitted.pdf(t), lw=2, color='r',
        label='Fitted Model X~LogNorm(mu={0:.1f}, sigma={1:.1f})'.format(lognorm_dist_fitted.mean(), lognorm_dist_fitted.std()))
ax.plot(t, lognorm_dist.pdf(t), lw=2, color='g', ls=':',
        label='Original Model X~LogNorm(mu={0:.1f}, sigma={1:.1f})'.format(lognorm_dist.mean(), lognorm_dist.std()))
ax.legend(loc='lower right')
plt.show()
The trick is to understand these two things (a quick numeric check follows the list):
- If a variable X is NORMAL with mean mu and std sigma, then EXP(X) ~ scipy.stats.lognorm(s=sigma, loc=0, scale=np.exp(mu))
- If your variable (x) HAS THE FORM of a LOGNORMAL, the model will be scipy.stats.lognorm(s=sigmaX, loc=0, scale=np.exp(muX)), with:
- muX = np.mean(np.log(x))
- sigmaX = np.std(np.log(x))
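Here is a minimal numeric check of those two points (the values mu=2.0 and sigma=0.7 are just example choices of mine):

import numpy as np
from scipy import stats

mu, sigma = 2.0, 0.7
x = np.exp(np.random.normal(mu, sigma, size=50000))  # lognormal data by construction
model = stats.lognorm(s=np.std(np.log(x)), loc=0, scale=np.exp(np.mean(np.log(x))))
print(model.mean(), x.mean())        # should agree closely
print(model.median(), np.median(x))  # both medians should be near np.exp(mu)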
Answered by bart cubrich
If you are just interested in plotting, you can use seaborn to get a lognormal distribution.
import seaborn as sns
import numpy as np
import scipy.stats as sp_stats  # the snippet below refers to sp_stats

mu = 0
sigma = 1
n = 1000
x = np.random.normal(mu, sigma, n)
sns.distplot(x, fit=sp_stats.norm)  # normal distribution

loc = 0
scale = 1
x = np.log(np.random.lognormal(loc, scale, n))
sns.distplot(x, fit=sp_stats.lognorm)  # log normal distribution
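Note that distplot is deprecated in recent seaborn releases. A rough modern equivalent (my own sketch: histplot has no fit= argument, so the fitted curve is overlaid manually) would be:

import numpy as np
import scipy.stats as sp_stats
import seaborn as sns
import matplotlib.pyplot as plt

x = np.random.lognormal(0, 1, 1000)
sns.histplot(x, stat='density')                      # histogram scaled as a density
shape, loc, scale = sp_stats.lognorm.fit(x, floc=0)  # fit with the location fixed at 0
grid = np.linspace(x.min(), x.max(), 200)
plt.plot(grid, sp_stats.lognorm.pdf(grid, shape, loc=loc, scale=scale), 'r-')
plt.show()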