Python: Tweaking seaborn.boxplot

Note: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/35131798/

Time: 2020-08-19 16:01:22  Source: igfitidea

Tweaking seaborn.boxplot

Tags: python, matplotlib, plot, boxplot, seaborn

Asked by clstaudt

I would like to compare a set of distributions of scores (score), grouped by some categories (centrality) and colored by some other (model). I've tried the following with seaborn:

import matplotlib.pyplot as plt
import seaborn

# data: DataFrame with "centrality", "score" and "model" columns; models: the model names
plt.figure(figsize=(14,6))
seaborn.boxplot(x="centrality", y="score", hue="model", data=data, palette=seaborn.color_palette("husl", len(models) +1))
seaborn.despine(offset=10, trim=True)
plt.savefig("/home/i11/staudt/Eval/properties-replication-test.pdf", bbox_inches="tight")

There are some problems I have with this plot:

  • There are a large number of outliers and I don't like how they are drawn here. Can I remove them? Can I change their appearance to reduce clutter? Can I at least color them so that their color matches the box color?
  • The model value original is special because all other distributions should be compared to the distribution of original. This should be visually reflected in the plot. Can I make original the first box of every group? Can I offset or mark it differently somehow? Would it be possible to draw a horizontal line through the median of each original distribution and through the group of boxes?
  • Some of the values of score are very small. How can I scale the y-axis properly to show them?

enter image description here

EDIT:

Here is an example with a log-scaled y-axis, which is also not yet ideal. Why do some boxes seem cut off at the low end?

enter image description here

Accepted answer by Lisa

Outlier display

You should be able to pass any arguments to seaborn.boxplot that you can pass to plt.boxplot (see documentation), so you could adjust the display of the outliers by setting flierprops. Here are some examples of what you can do with your outliers.

If you don't want to display them, you could do

seaborn.boxplot(x="centrality", y="score", hue="model", data=data,
                showfliers=False)

or you could make them light gray like so:

flierprops = dict(markerfacecolor='0.75', markersize=5,
              linestyle='none')
seaborn.boxplot(x="centrality", y="score", hue="model", data=data,
                flierprops=flierprops)
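
For readers who want to try both outlier options side by side, here is a minimal self-contained sketch. The synthetic DataFrame, the centrality names ("degree", "betweenness") and the model name "model_a" are assumptions made only so the example runs; they stand in for the asker's real data, which is not shown.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn

# Synthetic stand-in for the asker's `data` (assumption: long-format columns
# "centrality", "score", "model"); every group gets one obvious outlier at 5.0.
rng = np.random.default_rng(0)
rows = []
for centrality in ["degree", "betweenness"]:      # hypothetical category names
    for model in ["original", "model_a"]:         # "model_a" is hypothetical
        for score in list(rng.normal(1.0, 0.1, 30)) + [5.0]:
            rows.append({"centrality": centrality, "model": model, "score": score})
data = pd.DataFrame(rows)

fig, (ax_hidden, ax_gray) = plt.subplots(1, 2, figsize=(14, 6))

# Left panel: do not draw the outliers at all.
seaborn.boxplot(x="centrality", y="score", hue="model", data=data,
                showfliers=False, ax=ax_hidden)

# Right panel: keep the outliers but draw them as small light-gray markers.
flierprops = dict(markerfacecolor="0.75", markersize=5, linestyle="none")
seaborn.boxplot(x="centrality", y="score", hue="model", data=data,
                flierprops=flierprops, ax=ax_gray)
```

Passing ax= puts the two variants into one figure so they can be compared directly; in a single plot you would just use one of the two calls.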

Order of groups

You can set the order of the groups manually with hue_order, e.g.

seaborn.boxplot(x="centrality", y="score", hue="model", data=data,
                hue_order=["original", "Havel..","etc"])
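
As a rough illustration of putting original first in every group, the sketch below builds hue_order from whatever model names are present. The data and the names "model_a" and "model_b" are made up for the example; only "original" comes from the question.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn

rng = np.random.default_rng(1)
# Hypothetical model names, deliberately listed with "original" not first.
models = ["model_b", "original", "model_a"]
data = pd.DataFrame({
    "centrality": np.repeat(["degree", "betweenness"], 60),  # hypothetical names
    "model": np.tile(np.repeat(models, 20), 2),
    "score": rng.normal(1.0, 0.2, 120),
})

# Put "original" first, then the remaining models in alphabetical order.
hue_order = ["original"] + sorted(m for m in models if m != "original")

fig, ax = plt.subplots(figsize=(14, 6))
seaborn.boxplot(x="centrality", y="score", hue="model", data=data,
                hue_order=hue_order, ax=ax)

# The legend entries follow the same order as the boxes within each group.
legend_labels = [t.get_text() for t in ax.get_legend().get_texts()]
```

Building the list programmatically avoids typing every model name by hand, but a hard-coded list as in the answer works just as well.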

Scaling of y-axis

You could just get the minimum and maximum of all y-values and set the y-axis limits accordingly. Something like this:

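import numpy as np

y_values = data["score"].values
ax = seaborn.boxplot(x="centrality", y="score", hue="model", data=data)
# boxplot has no y_lim argument; set the limits on the returned Axes instead
ax.set_ylim(np.min(y_values), np.max(y_values))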

EDIT: This last point doesn't really make sense, since the automatic y-limit range will already include all the values, but I'm leaving it in just as an example of how to adjust these settings. As mentioned in the comments, log-scaling probably makes more sense.
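
The log-scaling idea could look roughly like the sketch below, which switches the y-axis of the returned Axes to a logarithmic scale. The data is synthetic (log-normally distributed, so all scores are positive and many are very small), and the category and model names other than "original" are assumptions; it does not try to explain the cut-off boxes from the question.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn

rng = np.random.default_rng(2)
# Synthetic, strictly positive scores spanning several orders of magnitude.
data = pd.DataFrame({
    "centrality": np.repeat(["degree", "betweenness"], 100),    # hypothetical names
    "model": np.tile(np.repeat(["original", "model_a"], 50), 2),  # "model_a" is hypothetical
    "score": rng.lognormal(mean=-6.0, sigma=2.0, size=200),
})

fig, ax = plt.subplots(figsize=(14, 6))
seaborn.boxplot(x="centrality", y="score", hue="model", data=data, ax=ax)

# Only the axis is rescaled; the box statistics are still computed on the raw scores.
ax.set_yscale("log")
```

A log axis cannot show zero or negative values, so this only helps when every score is positive.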
