Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep it under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow
Original URL: http://stackoverflow.com/questions/43016380/
Python Matplotlib plotting sample means in bar chart with confidence intervals but looks like box plots
Asked by Chris T.
I want to plot the means of four time series as a Matplotlib bar chart with confidence intervals. I also want to color them differently, to generate a bar chart like this:
So I wrote the following code:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
np.random.seed(12345)
df = pd.DataFrame([np.random.normal(-10, 200, 100),
                   np.random.normal(42, 150, 100),
                   np.random.normal(0, 120, 100),
                   np.random.normal(-5, 57, 100)],
                  index=[2012, 2013, 2014, 2015])
years = ('2012', '2013', '2014', '2015')
y_pos = np.arange(len(years))
df1_mean = df.iloc[0].mean()
df1_std = df.iloc[0].std()
df2_mean = df.iloc[1].mean()
df2_std = df.iloc[1].std()
df3_mean = df.iloc[2].mean()
df3_std = df.iloc[2].std()
df4_mean = df.iloc[3].mean()
df4_std = df.iloc[3].std()
value = (df1_mean, df2_mean, df3_mean, df4_mean)
Std = (df1_std, df2_std, df3_std, df4_std)
plt.bar(y_pos, value, yerr=Std, align='center', alpha=0.5)
plt.xticks(y_pos, years)
plt.ylabel('Stock price')
plt.title('Something')
plt.show()
which gives me this (see above). Not quite what I was expecting. Also, it looks like a box plot instead of a bar chart, where each sample mean should go all the way down to the x-axis.
I admit I am really new to Matplotlib, but I would really like to know what's going on with my code. It's supposed to be a simple task, but I can't seem to get it right. Should I call .subplots() instead? On top of that, I would really appreciate it if someone could show me how to (1) add a horizontal line (say, at the value of 100) to the same bar chart as a threshold value, and (2) color these four bars differently (the exact colors don't really matter).
Thank you.
Answered by ImportanceOfBeingErnest
By default the bars created by plt.bar start at y=0. For positive values they extend upwards, for negative values they extend downwards.
You can have them start at a different value by using the bottom argument and adding the magnitude of bottom to the values, so that the top of each bar still lands on the mean. This is done in the following code, where I also brought the dataframe into a more usual shape (years are columns).
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
np.random.seed(12345)
df = pd.DataFrame(np.c_[np.random.normal(-10, 200, 100),
                        np.random.normal(42, 150, 100),
                        np.random.normal(0, 120, 100),
                        np.random.normal(-5, 57, 100)],
                  columns=[2012, 2013, 2014, 2015])
value = df.mean()
std = df.std()
colors=["red", "green", "blue", "purple"]
plt.axhline(y=100, zorder=0)
plt.bar(range(len(df.columns)), value + np.abs(df.values.min()),
        bottom=df.values.min(), yerr=std, align='center', alpha=0.5,
        color=colors)
plt.xticks(range(len(df.columns)), df.columns)
plt.ylabel('Stock price')
plt.title('Something')
plt.show()
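Note that the code above passes the standard deviation as yerr, which is not the same thing as a confidence interval for the mean. As a rough sketch, and assuming a normal approximation is acceptable for 100 samples per year, the half-width of a 95% interval can be taken as 1.96 times the standard error of the mean. The helper names bar_geometry and plot_means, the constant Z_95, and the capsize and gray line color are my own choices for illustration, not part of the original answer.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Assumption: normal approximation, two-sided 95% quantile of N(0, 1).
Z_95 = 1.96


def bar_geometry(df, z=Z_95):
    """Return (bottom, heights, half_widths) for one bar per column of df."""
    means = df.mean()
    # Standard error of the mean: sample std divided by sqrt(sample size).
    sem = df.std(ddof=1) / np.sqrt(df.count())
    half_widths = z * sem
    # Start every bar at the smallest observation, as in the code above.
    bottom = df.values.min()
    # Heights are measured from `bottom`, so each bar top sits at its mean.
    heights = means - bottom
    return bottom, heights, half_widths


def plot_means(df, threshold=100, colors=("red", "green", "blue", "purple")):
    """Draw the bar chart with 95% error bars and a threshold line."""
    bottom, heights, half_widths = bar_geometry(df)
    positions = np.arange(len(df.columns))
    fig, ax = plt.subplots()
    ax.axhline(y=threshold, color="gray", zorder=0)
    ax.bar(positions, heights.to_numpy(), bottom=bottom,
           yerr=half_widths.to_numpy(), align="center", alpha=0.5,
           capsize=5, color=list(colors))
    ax.set_xticks(positions)
    ax.set_xticklabels(df.columns)
    ax.set_ylabel("Stock price")
    return ax


np.random.seed(12345)
df = pd.DataFrame(np.c_[np.random.normal(-10, 200, 100),
                        np.random.normal(42, 150, 100),
                        np.random.normal(0, 120, 100),
                        np.random.normal(-5, 57, 100)],
                  columns=[2012, 2013, 2014, 2015])
ax = plot_means(df)
```

Call plt.show() afterwards to display the figure. Because the interval half-widths scale with 1/sqrt(n), the error bars come out much shorter than the standard-deviation bars in the code above, so the chart looks less like a box plot.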