How to visualize 95% confidence interval in matplotlib?

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/20033396/

Time: 2020-08-18 19:19:09  Source: igfitidea

python matplotlib statistics

Question

I have learned how to find the 95% confidence interval with scipy.stats.t, like so:

In [1]: from scipy.stats import t
In [2]: t.interval(0.95, 10, loc=1, scale=2)  # 95% confidence interval
Out[2]: (-3.4562777039298762, 5.4562777039298762)
In [3]: t.interval(0.99, 10, loc=1, scale=2)  # 99% confidence interval
Out[3]: (-5.338545334351676, 7.338545334351676)

However, visualization is important to me. How can I show a confidence interval bar on each node of my curve in matplotlib?

What I am expecting is something like this:

[image]

Answer by CT Zhu

You don't need the .interval method. To get the size of the confidence interval, you just need the .ppf method.

import numpy as np
import scipy.stats as ss
import matplotlib.pyplot as plt

data_m = np.array([1, 2, 3, 4])       # means of your data
data_df = np.array([5, 6, 7, 8])      # degrees of freedom of your data
data_sd = np.array([11, 12, 12, 14])  # standard deviations of your data

# Draw the means with one vertical error bar per point
plt.errorbar([0, 1, 2, 3], data_m, yerr=ss.t.ppf(0.95, data_df) * data_sd)
plt.xlim((-1, 4))
plt.show()

ss.t.ppf(0.95, data_df)*data_sd is a fully vectorized way to get the (half) size of the interval, given the degrees of freedom and the standard deviation.
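To make the link between .interval and .ppf concrete, here is a minimal sketch using the numbers from the question (df=10, loc=1, scale=2). It assumes a two-sided interval, in which case the quantile that matches a confidence level c is (1 + c) / 2; the variable names are illustrative only.

```python
import numpy as np
from scipy.stats import t

confidence = 0.95
df, loc, scale = 10, 1, 2  # values taken from the question

# Two-sided interval: the matching quantile is (1 + confidence) / 2, i.e. 0.975
half_width = t.ppf((1 + confidence) / 2, df) * scale
low, high = t.interval(confidence, df, loc=loc, scale=scale)

print(low, high)                            # bounds reported by .interval
print(loc - half_width, loc + half_width)   # same bounds rebuilt from .ppf

# For comparison, the 0.95 quantile leaves 5% in each tail,
# so a symmetric band built from it covers 90%, not 95%.
q95 = t.ppf(0.95, df)
coverage = t.cdf(q95, df) - t.cdf(-q95, df)
print(coverage)
```

The last three lines show why the choice of quantile matters when the band is drawn on both sides of the mean.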


Answer by Ivan

First, you need to divide by the standard deviation. Second, if your data is two-sided (as the plot suggests), you need to allow 2.5% of misses on each side of the Gaussian, that is:

ss.t.ppf(0.975, data_df)/np.sqrt(data_df)

Since you miss 2.5% on each side, the total miss is 5%.
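As a rough sketch of how a two-sided 95% band could be drawn, the snippet below reuses the sample arrays from the first answer and plots them with plt.errorbar. Note that it uses the textbook half-width for a mean, t(0.975, df) * sd / sqrt(n), rather than the exact expression above, and it assumes each mean comes from n = df + 1 observations. The output file name is hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import scipy.stats as ss

data_m = np.array([1, 2, 3, 4])       # sample means (from the first answer)
data_df = np.array([5, 6, 7, 8])      # degrees of freedom (from the first answer)
data_sd = np.array([11, 12, 12, 14])  # sample standard deviations (from the first answer)

# Assumption: each mean is computed from n = df + 1 observations
data_n = data_df + 1

# Two-sided 95% interval: 2.5% in each tail, hence the 0.975 quantile
half_width = ss.t.ppf(0.975, data_df) * data_sd / np.sqrt(data_n)

fig, ax = plt.subplots()
ax.errorbar([0, 1, 2, 3], data_m, yerr=half_width, fmt="o-", capsize=4)
ax.set_xlim(-1, 4)
fig.savefig("ci_errorbar.png")  # hypothetical output file name
```

If the standard deviations are already standard errors of the mean, the division by np.sqrt(data_n) should be dropped.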