Original source: http://stackoverflow.com/questions/23447262/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Fitting a Gaussian to a histogram with MatPlotLib and Numpy - wrong Y-scaling?
Asked by El Confuso
I have written the code below to fit a Gaussian curve to a histogram. It seems to work, but the Y scaling of the curve does not match the histogram. What am I doing wrong?
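import matplotlib.pyplot as plt
import numpy as np
from scipy.stats import norm  # replaces mlab.normpdf, which newer Matplotlib no longer provides

data = [0, 1, 1, 2, 2, 2, 3, 3, 4]  # renamed from "list" to avoid shadowing the built-in

plt.figure(1)
plt.hist(data)  # bar heights are raw counts
plt.xlim((min(data), max(data)))

mean = np.mean(data)
variance = np.var(data)
sigma = np.sqrt(variance)
x = np.linspace(min(data), max(data), 100)
plt.plot(x, norm.pdf(x, mean, sigma))  # this curve integrates to 1
plt.show()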
Thanks!
Accepted answer by David Zwicker
You need to normalize the histogram, since the distribution you plot is also normalized:
import matplotlib.pyplot as plt
import numpy as np
from scipy.stats import norm  # replaces mlab.normpdf, which newer Matplotlib no longer provides

arr = np.random.randn(100)

plt.figure(1)
# density=True rescales the bars so that their total area is 1
# (older Matplotlib releases used normed=True, which has since been removed)
plt.hist(arr, density=True)
plt.xlim((min(arr), max(arr)))

mean = np.mean(arr)
variance = np.var(arr)
sigma = np.sqrt(variance)
x = np.linspace(min(arr), max(arr), 100)
plt.plot(x, norm.pdf(x, mean, sigma))
plt.show()
Note the density=True (normed=True in older Matplotlib releases) in the call to plt.hist. Note also that I changed your sample data, because the histogram looks odd with too few data points.
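To see why normalizing makes the two plots comparable, here is a minimal sketch that checks the bar areas numerically. It calls np.histogram directly instead of plt.hist (which, as far as I know, delegates the binning to it); the seed, sample size, and bin count are assumptions chosen only for illustration.

```python
import numpy as np

# Hypothetical sample data, seeded so the run is repeatable.
rng = np.random.default_rng(0)
arr = rng.normal(size=100)

# Raw counts: bar heights are the number of samples per bin.
counts, edges = np.histogram(arr, bins=10)
widths = np.diff(edges)
area_counts = float(np.sum(counts * widths))

# density=True: each bar height is count / (N * bin width),
# so the total area of the bars is 1, like a probability density.
density, _ = np.histogram(arr, bins=10, density=True)
area_density = float(np.sum(density * widths))

print("area of count histogram:  ", area_counts)
print("area of density histogram:", area_density)
```

The count histogram has an area of roughly the number of samples times the bin width, while the density histogram has an area of 1, which is the same normalization the Gaussian pdf uses.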
If you would rather keep the original (unnormalized) histogram and adjust the curve instead, you have to scale the distribution so that its integral equals the integral of the histogram, i.e. the number of data points multiplied by the width of the bars. This can be achieved like this:
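import matplotlib.pyplot as plt
import numpy as np
from scipy.stats import norm  # replaces mlab.normpdf, which newer Matplotlib no longer provides

arr = np.random.randn(1000)

plt.figure(1)
# plt.hist returns the counts, the bin edges, and the bar patches
counts, bin_edges, patches = plt.hist(arr)
plt.xlim((min(arr), max(arr)))

mean = np.mean(arr)
variance = np.var(arr)
sigma = np.sqrt(variance)
x = np.linspace(min(arr), max(arr), 100)
dx = bin_edges[1] - bin_edges[0]  # width of a single bar
scale = len(arr) * dx             # area of the unnormalized histogram
plt.plot(x, norm.pdf(x, mean, sigma) * scale)
plt.show()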
Note the scale factor, calculated as the number of items times the width of a single bar.
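As a rough numerical check of that scale factor, the sketch below compares the area of a count histogram with the area under the scaled Gaussian. The gaussian_pdf helper, the seed, and the integration grid are assumptions made for this illustration and are not part of the original answer; the density is written out in NumPy so the snippet runs without SciPy.

```python
import numpy as np

def gaussian_pdf(x, mean, sigma):
    """Normal density written out with NumPy (illustrative helper)."""
    return np.exp(-0.5 * ((x - mean) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

# Hypothetical sample data, seeded so the run is repeatable.
rng = np.random.default_rng(1)
arr = rng.normal(size=1000)

# Count histogram with equal-width bins, as plt.hist draws by default.
counts, edges = np.histogram(arr, bins=10)
dx = edges[1] - edges[0]
scale = len(arr) * dx

# Area of the bars: every bar has width dx, so the area is sum(counts) * dx.
hist_area = float(np.sum(counts) * dx)

# Area under the scaled curve, integrated with the trapezoid rule on a
# grid wide enough (8 sigma each side) to cover practically all the mass.
mean, sigma = arr.mean(), arr.std()
x = np.linspace(mean - 8 * sigma, mean + 8 * sigma, 4001)
y = gaussian_pdf(x, mean, sigma) * scale
curve_area = float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)

print("histogram area:   ", hist_area)
print("scaled curve area:", curve_area)
```

Both areas agree with len(arr) * dx, which is why multiplying the pdf by this factor puts the curve on the same vertical scale as the raw counts. The check relies on equal-width bins; with unequal bins a single dx would not apply.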