pandas: Dot-boxplots from DataFrames
Original source: http://stackoverflow.com/questions/23519135/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Dot-boxplots from DataFrames
Asked by Amelio Vazquez-Reina
DataFrames in Pandas have a boxplot method, but is there any way to create dot-boxplots in Pandas, or otherwise with seaborn?
By a dot-boxplot, I mean a boxplot that shows the actual data points (or a relevant sample of them) inside the plot, e.g. like the example below (obtained in R).


Answered by jrjc
For a more precise answer related to OP's question (with Pandas):
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

data = pd.DataFrame({"A": np.random.normal(0.8, 0.2, 20),
                     "B": np.random.normal(0.8, 0.1, 20),
                     "C": np.random.normal(0.9, 0.1, 20)})
data.boxplot()
# Overlay each column's points, jittered around its box position (1, 2, 3)
for i, d in enumerate(data):
    y = data[d]
    x = np.random.normal(i + 1, 0.04, len(y))
    plt.plot(x, y, mfc=["orange", "blue", "yellow"][i], mec='k', ms=7, marker="o", linestyle="None")
# Dashed horizontal reference line at y = 1
plt.hlines(1, 0, 4, linestyle="--")
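The loop above hard-codes three colors and draws unseeded jitter. As a minimal sketch of the same idea for a DataFrame with any number of columns, the helper below wraps it in a function. The name `dot_boxplot` and its `jitter` and `seed` parameters are illustrative assumptions, not part of pandas, and the `Agg` backend line is only there so the example runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def dot_boxplot(df, jitter=0.04, seed=0, ax=None):
    """Boxplot every column of df and overlay its points with x-jitter.

    Hypothetical helper for illustration; not a pandas or matplotlib API.
    """
    rng = np.random.default_rng(seed)
    if ax is None:
        _, ax = plt.subplots()
    df.boxplot(ax=ax)
    # By default the box for the k-th column (0-based) sits at x = k + 1.
    for i, col in enumerate(df.columns):
        y = df[col].dropna()
        x = rng.normal(i + 1, jitter, len(y))
        # One plot call per column, so matplotlib's color cycle
        # gives each column its own marker color.
        ax.plot(x, y, marker="o", linestyle="None", mec="k", ms=7, alpha=0.7)
    return ax


sample_rng = np.random.default_rng(1)
data = pd.DataFrame({
    "A": sample_rng.normal(0.8, 0.2, 20),
    "B": sample_rng.normal(0.8, 0.1, 20),
    "C": sample_rng.normal(0.9, 0.1, 20),
})
ax = dot_boxplot(data)
```

Call `plt.show()` or `ax.figure.savefig(...)` afterwards to view the result. Seeding the generator makes the jitter reproducible between runs, which the original snippet does not attempt.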


Old version (more generic):
With matplotlib:
import numpy as np
import matplotlib.pyplot as plt

a = np.random.normal(0, 2, 1000)
b = np.random.normal(-2, 7, 100)
data = [a, b]
plt.boxplot(data)  # Or you can use the boxplot from Pandas
# Boxes are drawn at x = 1 and x = 2; scatter each sample around its box
for i in [1, 2]:
    y = data[i - 1]
    x = np.random.normal(i, 0.02, len(y))
    plt.plot(x, y, 'r.', alpha=0.2)
Which gives:

Inspired by this tutorial.
Hope this helps!
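Both snippets in this answer draw the x offsets from a normal distribution, so an occasional point can land well away from its box. One possible variation, not part of the original answer, is uniform jitter bounded by a half-width. The sketch below assumes a hypothetical helper `uniform_jitter` and a default `half_width` of 0.2, chosen to stay inside matplotlib's default box width of 0.5.

```python
import numpy as np


def uniform_jitter(position, n, half_width=0.2, seed=None):
    """Return n x-coordinates spread uniformly around a box position.

    Hypothetical helper for illustration; every value lies in
    [position - half_width, position + half_width).
    """
    rng = np.random.default_rng(seed)
    return position + rng.uniform(-half_width, half_width, n)


a = np.random.default_rng(0).normal(0, 2, 1000)
x = uniform_jitter(1, len(a), seed=0)
```

The resulting `x` can be passed to `plt.plot(x, a, 'r.', alpha=0.2)` in place of the `np.random.normal(i, 0.02, len(y))` draw.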
Answered by mwaskom
This will be possible with seaborn version 0.6 (currently in the master branch on GitHub) using the stripplot function. Here's an example:
import seaborn as sns

tips = sns.load_dataset("tips")
sns.boxplot(x="day", y="total_bill", data=tips)
sns.stripplot(x="day", y="total_bill", data=tips,
              size=4, jitter=True, edgecolor="gray")



