Plot a histogram with y-axis as percentage (using FuncFormatter?)
Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Original: http://stackoverflow.com/questions/51473993/
Asked by Mathieu
I have a list of data in which the numbers are between 1000 and 20 000.
data = [1000, 1000, 5000, 3000, 4000, 16000, 2000]
When I plot a histogram using the hist() function, the y-axis represents the number of occurrences of the values within a bin. Instead of the number of occurrences, I would like to have the percentage of occurrences.
Code for the plot:
import matplotlib.pyplot as plt
f, ax = plt.subplots(1, 1, figsize=(10,5))
ax.hist(data, bins = len(list(set(data))))
I've been looking at this post, which describes an example using FuncFormatter, but I can't figure out how to adapt it to my problem. Some help and guidance would be welcome :)
EDIT: The main issue is with the to_percent(y, position) function used by the FuncFormatter. I guess y corresponds to one given value on the y-axis. I need to divide this value by the total number of elements, which I apparently can't pass to the function...
EDIT 2: Current solution, which I dislike because of the use of a global variable:
import matplotlib
import matplotlib.pyplot as plt
from matplotlib.ticker import FuncFormatter

def to_percent(y, position):
    # Ignore the passed in position. This has the effect of scaling the default
    # tick locations.
    global n
    s = str(round(100 * y / n, 3))
    print(y)
    # The percent symbol needs escaping in latex
    if matplotlib.rcParams['text.usetex'] is True:
        return s + r'$\%$'
    else:
        return s + '%'

def plotting_hist(folder, output):
    global n
    data = list()
    # Do stuff to create data from folder
    n = len(data)
    f, ax = plt.subplots(1, 1, figsize=(10,5))
    ax.hist(data, bins = len(list(set(data))), rwidth = 1)
    formatter = FuncFormatter(to_percent)
    plt.gca().yaxis.set_major_formatter(formatter)
    plt.savefig("{}.png".format(output), dpi=500)
EDIT 3: Method with density = True
Actual desired output (method with global variable):
Answer by ImportanceOfBeingErnest
Other answers seem utterly complicated. A histogram which shows the proportion instead of the absolute amount can easily be produced by weighting the data with 1/n, where n is the number of datapoints.
Then a PercentFormatter can be used to show the proportion (e.g. 0.45) as percentage (45%).
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import PercentFormatter
data = [1000, 1000, 5000, 3000, 4000, 16000, 2000]
plt.hist(data, weights=np.ones(len(data)) / len(data))
plt.gca().yaxis.set_major_formatter(PercentFormatter(1))
plt.show()
Here we see that three of the 7 values are in the first bin, i.e. 3/7=43%.
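The 3/7 figure can be checked without drawing anything. The following is a minimal sketch, assuming the default of 10 equal-width bins that plt.hist uses when no bins argument is given; the variable names proportions and edges are illustrative only.

```python
import numpy as np

data = [1000, 1000, 5000, 3000, 4000, 16000, 2000]

# Each data point contributes 1/n to its bin, so bin heights are proportions.
weights = np.ones(len(data)) / len(data)

# Same binning as the plot above, assuming the default of 10 bins.
proportions, edges = np.histogram(data, bins=10, weights=weights)

print(proportions[0])     # share of the data that falls in the first bin
print(proportions.sum())  # the shares of all bins add up to (about) 1
```

The first bin spans 1000 to 2500 and therefore holds 1000, 1000 and 2000, which is where the three out of seven comes from.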
Answer by DavidG
You can calculate the percentages yourself, then plot them as a bar chart. This requires you to use numpy.histogram (which matplotlib uses "under the hood" anyway). You can then adjust the y tick labels:
import matplotlib.pyplot as plt
import numpy as np
f, ax = plt.subplots(1, 1, figsize=(10,5))
data = [1000, 1000, 5000, 3000, 4000, 16000, 2000]
heights, bins = np.histogram(data, bins = len(list(set(data))))
percent = [i/sum(heights)*100 for i in heights]
ax.bar(bins[:-1], percent, width=2500, align="edge")
vals = ax.get_yticks()
ax.set_yticklabels(['%1.2f%%' %i for i in vals])
plt.show()
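A possible variation on the code above, not part of the original answer: the bar widths can be derived from the bin edges with np.diff instead of being hard-coded, and PercentFormatter(xmax=100) can label the ticks instead of set_yticklabels. The function name percent_bar_histogram is a hypothetical helper introduced for this sketch.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import PercentFormatter


def percent_bar_histogram(data, bins):
    """Draw a bar chart whose heights are percentages of all data points."""
    counts, edges = np.histogram(data, bins=bins)
    percent = 100 * counts / counts.sum()

    fig, ax = plt.subplots(figsize=(10, 5))
    # np.diff(edges) is the width of each bin, so nothing is hard-coded.
    ax.bar(edges[:-1], percent, width=np.diff(edges), align="edge")
    # xmax=100 tells the formatter the values are already percentages.
    ax.yaxis.set_major_formatter(PercentFormatter(xmax=100))
    return fig, ax, percent


data = [1000, 1000, 5000, 3000, 4000, 16000, 2000]
fig, ax, percent = percent_bar_histogram(data, bins=len(set(data)))
```

Calling plt.show() afterwards displays the figure. Because the formatter is attached to the axis, the labels stay correct if the tick locations change, for example after resizing or zooming.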
Answer by Georgy
You can use functools.partial to avoid using globals in your example.
Just add n to the function parameters:
def to_percent(y, position, n):
    s = str(round(100 * y / n, 3))
    if matplotlib.rcParams['text.usetex']:
        return s + r'$\%$'
    return s + '%'
and then create a partial function of two arguments that you can pass to FuncFormatter:
percent_formatter = partial(to_percent, n=len(data))
formatter = FuncFormatter(percent_formatter)
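A closure is another way to bind n without a global, shown here as a sketch of a different technique than functools.partial. The names make_percent_formatter and use_tex are hypothetical and introduced only for illustration; use_tex stands in for the matplotlib.rcParams['text.usetex'] check so the snippet runs without matplotlib.

```python
def make_percent_formatter(n, use_tex=False):
    """Return a (y, position) callable that formats y as a percentage of n."""
    def to_percent(y, position):
        # n is captured from the enclosing call, so no global is needed.
        s = str(round(100 * y / n, 3))
        return s + (r'$\%$' if use_tex else '%')
    return to_percent


fmt = make_percent_formatter(7)
print(fmt(3, 0))  # prints 42.857%
```

The returned function has the two-argument signature that FuncFormatter expects, so it could be passed as FuncFormatter(make_percent_formatter(len(data))).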
Full code:
from functools import partial
import matplotlib
import matplotlib.pyplot as plt
from matplotlib.ticker import FuncFormatter

data = [1000, 1000, 5000, 3000, 4000, 16000, 2000]

def to_percent(y, position, n):
    s = str(round(100 * y / n, 3))
    if matplotlib.rcParams['text.usetex']:
        return s + r'$\%$'
    return s + '%'

def plotting_hist(data):
    f, ax = plt.subplots(figsize=(10, 5))
    ax.hist(data,
            bins=len(set(data)),
            rwidth=1)
    percent_formatter = partial(to_percent,
                                n=len(data))
    formatter = FuncFormatter(percent_formatter)
    plt.gca().yaxis.set_major_formatter(formatter)
    plt.show()

plotting_hist(data)