Notice: This page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/3866520/

Time: 2020-08-18 13:06:40  Source: igfitidea

How can I plot a histogram such that the heights of the bars sum to 1 in matplotlib?

Tags: python, graph, numpy, matplotlib, scipy

Question

I'd like to plot a normalized histogram from a vector using matplotlib. I tried the following:

plt.hist(myarray, normed=True)

as well as:

plt.hist(myarray, normed=1)

but neither option produces a y-axis from [0, 1] such that the bar heights of the histogram sum to 1. I'd like to produce such a histogram -- how can I do it?

Accepted answer by dtlussier

It would be more helpful if you posed a more complete working (or in this case non-working) example.

I tried the following:

import numpy as np
import matplotlib.pyplot as plt

x = np.random.randn(1000)

fig = plt.figure()
ax = fig.add_subplot(111)
n, bins, rectangles = ax.hist(x, 50, density=True)
fig.canvas.draw()
plt.show()

This will indeed produce a bar-chart histogram with a y-axis that goes from [0,1].

Further, as per the hist documentation (i.e. ax.hist? from IPython), I think the sum is fine too:

*normed*:
If *True*, the first element of the return tuple will
be the counts normalized to form a probability density, i.e.,
``n/(len(x)*dbin)``.  In a probability density, the integral of
the histogram should be 1; you can verify that with a
trapezoidal integration of the probability density function::

    pdf, bins, patches = ax.hist(...)
    print np.sum(pdf * np.diff(bins))

Giving this a try after the commands above:

np.sum(n * np.diff(bins))

I get a return value of 1.0, as expected. Remember that normed=True (spelled density=True in the code above) doesn't mean that the sum of the values at each bar will be unity, but rather that the integral over the bars is unity. In my case, np.sum(n) returned approximately 7.2767.
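To make the distinction concrete, here is a minimal sketch using np.histogram, which as far as I know applies the same density normalization as hist. The fixed seed and the variable names (widths, area, height_sum, fractions) are assumptions chosen for illustration. It checks that the area is 1, that the plain sum of heights is roughly 1 / bin width, and that multiplying each height by its bin width gives values that do sum to 1.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility
x = rng.standard_normal(1000)

# density=True returns n / (len(x) * bin_width), i.e. a probability density.
n, bins = np.histogram(x, bins=50, density=True)
widths = np.diff(bins)

area = np.sum(n * widths)   # integral over the bars
height_sum = np.sum(n)      # plain sum of bar heights
fractions = n * widths      # fraction of samples falling in each bin

print(area)             # close to 1.0
print(height_sum)       # close to 1 / bin width, generally not 1
print(fractions.sum())  # close to 1.0
```

With equal-width bins, the sum of heights is simply the reciprocal of the bin width, which is why it depends on the number of bins and the range of the data.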

Answer by Killer

I know this answer is too late considering the question is dated 2010 but I came across this question as I was facing a similar problem myself. As already stated in the answer, normed=True means that the total area under the histogram is equal to 1 but the sum of heights is not equal to 1. However, I wanted to, for convenience of physical interpretation of a histogram, make one with sum of heights equal to 1.

I found a hint in the following question - Python: Histogram with area normalized to something other than 1

But I was not able to find a way of making bars mimic the histtype="step" feature of hist(). This diverted me to: Matplotlib - Stepped histogram with already binned data

If the community finds it acceptable I should like to put forth a solution which synthesises ideas from both the above posts.

import numpy as np
import matplotlib.pyplot as plt

# Let X be the array whose histogram needs to be plotted.
nx, xbins, ptchs = plt.hist(X, bins=20)
plt.clf() # Get rid of this histogram since not the one we want.

nx_frac = nx / float(len(X)) # Each bin count divided by total number of objects.
width = xbins[1] - xbins[0] # Width of each bin.
# list() is needed on Python 3, where zip() returns an iterator.
x = np.ravel(list(zip(xbins[:-1], xbins[:-1] + width)))
y = np.ravel(list(zip(nx_frac, nx_frac)))

plt.plot(x, y, linestyle="dashed", label="MyLabel")
#... Further formatting.

This has worked wonderfully for me, though in some cases I have noticed that the leftmost or rightmost "bar" of the histogram does not close down by touching the lowest point of the Y-axis. In such a case, adding an element 0 at the beginning or the end of y achieved the necessary result.
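The closing step described above can be sketched as follows. The helper name step_outline and the sample data are hypothetical, and the bin counts come from np.histogram rather than from a discarded plt.hist call. Every bin edge is repeated in x, and a 0 is placed at both ends of y so the outline always returns to the baseline.

```python
import numpy as np

def step_outline(data, bins=20):
    """Return (x, y) tracing a histogram outline whose bar heights sum to 1."""
    counts, edges = np.histogram(data, bins=bins)
    fractions = counts / counts.sum()  # fraction of samples per bin
    # Each edge appears twice in x; each height appears twice in y,
    # with a 0 at both ends so the outline drops to the baseline.
    x = np.repeat(edges, 2)
    y = np.concatenate(([0.0], np.repeat(fractions, 2), [0.0]))
    return x, y

rng = np.random.default_rng(0)  # illustrative data only
x, y = step_outline(rng.standard_normal(500))
```

Passing x and y to plt.plot(x, y, linestyle="dashed") should then draw the closed outline. In matplotlib versions where hist accepts both arguments, plt.hist(X, bins=20, weights=np.ones(len(X)) / len(X), histtype="step") is probably a shorter route to the same picture.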

Just thought I'd share my experience. Thank you.

Answer by Carsten König

If you want the sum of all bars to equal unity, weight each value by 1 over the total number of values:

weights = np.ones_like(myarray) / len(myarray)
plt.hist(myarray, weights=weights)

Hope that helps, although the thread is quite old...

Note for Python 2.x: cast one of the operands of the division to float(), as otherwise you would end up with zeros due to integer division.
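As a rough check of the weighting idea, the sketch below uses np.histogram with the same kind of weights; plt.hist should return the same heights as its first return value. The sample values and the choice of 4 bins are made up for illustration. Requesting dtype=float for the weights is one way to sidestep the integer-division issue when myarray holds integers.

```python
import numpy as np

myarray = np.array([1, 2, 2, 3, 3, 3, 4, 4, 5, 9])  # integer data, illustrative

# Each value contributes 1/N, so each bin total is a fraction of the sample.
# dtype=float keeps the weights fractional even for integer input.
weights = np.ones_like(myarray, dtype=float) / len(myarray)

heights, edges = np.histogram(myarray, bins=4, weights=weights)
print(heights)        # fraction of values in each bin
print(heights.sum())  # close to 1.0
```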

Answer by Yuri Brovman

Here is another simple solution using the np.histogram() function.

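import numpy as np
import matplotlib.pyplot as plt

myarray = np.random.random(100)
# density=True replaces normed=True, which newer NumPy releases no longer accept.
results, edges = np.histogram(myarray, density=True)
binWidth = edges[1] - edges[0]
plt.bar(edges[:-1], results*binWidth, binWidth, align="edge")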

You can indeed check that the total sums up to 1 with:

> print(sum(results*binWidth))
1.0
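The single binWidth above assumes equal-width bins. As a hedged sketch of the more general case, the example below uses np.diff(edges) so that each density value is multiplied by its own bin width; the unequal edges in edges_in are hypothetical. It also checks that the result matches the raw counts divided by the total count.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, for reproducibility only
myarray = rng.random(100)
edges_in = np.array([0.0, 0.1, 0.3, 0.6, 1.0])  # hypothetical unequal bins

density, edges = np.histogram(myarray, bins=edges_in, density=True)
fractions = density * np.diff(edges)  # multiply each bar by its own width

# The same numbers follow directly from the raw counts.
counts, _ = np.histogram(myarray, bins=edges_in)
print(np.allclose(fractions, counts / counts.sum()))
print(fractions.sum())  # close to 1.0
```

To draw these bars, plt.bar(edges[:-1], fractions, np.diff(edges), align="edge") should work, since bar accepts an array of widths.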