Python: Tweaking seaborn.boxplot
Original question: http://stackoverflow.com/questions/35131798/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Tweaking seaborn.boxplot
Asked by clstaudt
I would like to compare a set of distributions of scores (score), grouped by some categories (centrality) and colored by some other (model). I've tried the following with seaborn:
import matplotlib.pyplot as plt
import seaborn
# `data` (a DataFrame) and `models` (the collection of model names) are defined elsewhere
plt.figure(figsize=(14,6))
seaborn.boxplot(x="centrality", y="score", hue="model", data=data, palette=seaborn.color_palette("husl", len(models) +1))
seaborn.despine(offset=10, trim=True)
plt.savefig("/home/i11/staudt/Eval/properties-replication-test.pdf", bbox_inches="tight")
There are some problems I have with this plot:
- There is a large number of outliers and I don't like how they are drawn here. Can I remove them? Can I change the appearance to show less clutter? Can I color them at least so that their color matches the box color?
- The model value original is special because all other distributions should be compared to the distribution of original. This should be visually reflected in the plot. Can I make original the first box of every group? Can I offset or mark it differently somehow? Would it be possible to draw a horizontal line through the median of each original distribution and through the group of boxes?
- Some of the values of score are very small. How can I scale the y-axis properly to show them?
EDIT:
Here is an example with a log-scaled y-axis, which is also not yet ideal. Why do some boxes seem cut off at the low end?
Accepted answer by Lisa
Outlier display
You should be able to pass any arguments to seaborn.boxplot that you can pass to plt.boxplot (see documentation), so you could adjust the display of the outliers by setting flierprops. Here are some examples of what you can do with your outliers.
If you don't want to display them, you could do
seaborn.boxplot(x="centrality", y="score", hue="model", data=data,
showfliers=False)
or you could make them light gray like so:
flierprops = dict(markerfacecolor='0.75', markersize=5,
linestyle='none')
seaborn.boxplot(x="centrality", y="score", hue="model", data=data,
flierprops=flierprops)
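To compare the two options side by side, here is a minimal sketch on made-up data. The DataFrame contents, the centrality names ("degree", "betweenness"), the model name "other" and the marker settings are assumptions for illustration, not taken from the question. How flierprops interacts with seaborn's own outlier styling may vary between seaborn versions, so treat this as a starting point.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn

# Synthetic stand-in for the question's data (values are invented)
rng = np.random.default_rng(1)
data = pd.DataFrame({
    "centrality": np.repeat(["degree", "betweenness"], 100),
    "model": np.tile(np.repeat(["original", "other"], 50), 2),
    "score": rng.standard_t(df=2, size=200),  # heavy tails give many outliers
})

fig, (ax_hidden, ax_gray) = plt.subplots(1, 2, figsize=(14, 6), sharey=True)

# Left panel: outliers are not drawn at all
seaborn.boxplot(x="centrality", y="score", hue="model", data=data,
                showfliers=False, ax=ax_hidden)
ax_hidden.set_title("showfliers=False")

# Right panel: outliers drawn as small light-gray markers
flierprops = dict(marker="o", markerfacecolor="0.75", markeredgecolor="0.75",
                  markersize=3, linestyle="none")
seaborn.boxplot(x="centrality", y="score", hue="model", data=data,
                flierprops=flierprops, ax=ax_gray)
ax_gray.set_title("light-gray fliers")
```

Passing ax= explicitly keeps each variant on its own panel, which makes it easier to judge which one shows less clutter.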
Order of groups
You can set the order of the groups manually with hue_order, e.g.
seaborn.boxplot(x="centrality", y="score", hue="model", data=data,
hue_order=["original", "Havel..","etc"])
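If you would rather not type the list by hand, the order can be derived from the data. The following is only a sketch: reference_first is a hypothetical helper (not part of seaborn), and the model names are placeholders.

```python
def reference_first(levels, reference="original"):
    """Return the unique levels with `reference` moved to the front."""
    unique = list(dict.fromkeys(levels))  # de-duplicate, keep first-seen order
    if reference not in unique:
        raise ValueError(f"{reference!r} not found among the levels")
    return [reference] + [lvl for lvl in unique if lvl != reference]

# Placeholder model names, for illustration only
models = ["model_a", "original", "model_b", "model_a"]
hue_order = reference_first(models)
print(hue_order)  # ['original', 'model_a', 'model_b']

# With the question's DataFrame this could be used as:
# seaborn.boxplot(x="centrality", y="score", hue="model", data=data,
#                 hue_order=reference_first(data["model"]))
```

The remaining models keep the order in which they first appear, so only the position of original changes.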
Scaling of y-axis
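You could get the minimum and maximum of all y-values and set the y-axis limits accordingly. Note that seaborn.boxplot has no y_lim argument; the limits are set on the Axes object it returns. Something like this:
import numpy as np
y_values = data["score"].values
ax = seaborn.boxplot(x="centrality", y="score", hue="model", data=data)
# set the limits on the returned Axes rather than passing them to boxplot
ax.set_ylim(np.min(y_values), np.max(y_values))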
EDIT: This last point doesn't really make sense, since the automatic y-axis range will already include all the values, but I'm leaving it as an example of how to adjust these settings. As mentioned in the comments, log-scaling probably makes more sense.
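For the log-scaling route, a minimal sketch could look like the following. The data is synthetic (the centrality names, the model name "other" and the score values are assumptions), and it relies on matplotlib's Axes.set_yscale. A log axis cannot display zero or negative values, so this only applies if all scores are strictly positive.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn

# Synthetic stand-in for the question's data: small, strictly positive scores
rng = np.random.default_rng(0)
data = pd.DataFrame({
    "centrality": np.repeat(["degree", "betweenness"], 100),
    "model": np.tile(np.repeat(["original", "other"], 50), 2),
    "score": rng.lognormal(mean=-6, sigma=2, size=200),
})

fig, ax = plt.subplots(figsize=(14, 6))
seaborn.boxplot(x="centrality", y="score", hue="model", data=data, ax=ax)
ax.set_yscale("log")  # switch the y-axis to a logarithmic scale
```

Note that the quartiles and whiskers are still computed on the raw scores here; only the axis is rescaled.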