Note: this page reproduces a popular Stack Overflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL, and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/40892300/

Date: 2020-09-14 02:32:58  Source: igfitidea

Set y-axis scale for pandas Dataframe Boxplot(), 3 Deviations?

Tags: python, pandas, dataframe, boxplot

Asked by Python_Learner_DK

I'm trying to make a single boxplot chart area per month, with separate boxplots grouped and labeled by industry, and then have the y-axis use a scale I specify.

In a perfect world this would be dynamic, and I could set the axis to span a certain number of standard deviations from the overall mean. I could live with another way of setting the y-axis dynamically, but I want it to be the same on all of the monthly grouped boxplots. I don't know the best way to handle this yet and am open to suggestions. All I know is that the values being plotted now are way too large for the charts to be meaningful.

I've tried all kinds of code and had zero luck scaling the axis; the code below is as close as I could get to the graph I want.

Here's a link to some dummy data: https://drive.google.com/open?id=0B4xdnV0LFZI1MmlFcTBweW82V0k

And for the code I'm using Python 3.5:

import pandas as pd
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
matplotlib.use('TkAgg')
import pylab    
df =  pd.read_csv('Query_Final_2.csv')
df['Ship_Date'] = pd.to_datetime(df['Ship_Date'], errors = 'coerce')
df1 = (df.groupby('Industry'))
print(
df1.boxplot(column='Gross_Margin',layout=(1,9), figsize=(20,10), whis=[5,95])
,pylab.show()
)

Answered by AlexG

Here is a cleaned-up version of your code with the solution:

import pandas as pd
import matplotlib.pyplot as plt

df =  pd.read_csv('Query_Final_2.csv')
df['Ship_Date'] = pd.to_datetime(df['Ship_Date'], errors = 'coerce')
df1 = df.groupby('Industry')

axes = df1.boxplot(column='Gross_Margin',layout=(1,9), figsize=(20,10),
                   whis=[5,95], return_type='axes')
for ax in axes.values():
    ax.set_ylim(-2.5, 2.5)

plt.show()

The key is to pass return_type='axes' so the subplots come back as Axes objects, and then set the limits on each one individually.
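
To make the limits dynamic, as the question asks, the same idea can be combined with a computed range. The following is a minimal sketch, not part of the original answer: the synthetic data stands in for Query_Final_2.csv, and `sigma_limits` is a hypothetical helper. It also handles the returned container as either a dict-like or a Series of Axes, on the assumption that this may differ between pandas versions.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; remove for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in for Query_Final_2.csv (assumption: the real file is not available here).
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "Industry": rng.choice(["Auto", "Retail", "Tech"], size=300),
    "Gross_Margin": rng.standard_t(df=3, size=300),  # heavy-tailed, so some outliers
})

def sigma_limits(values, n_std=3.0):
    """Hypothetical helper: return (low, high) as mean +/- n_std standard deviations."""
    center = values.mean()
    spread = values.std()
    return center - n_std * spread, center + n_std * spread

# Limits come from the whole column, so every subplot shares the same scale.
low, high = sigma_limits(df["Gross_Margin"])

axes = df.groupby("Industry").boxplot(
    column="Gross_Margin", layout=(1, 3), figsize=(12, 4),
    whis=[5, 95], return_type="axes",
)

# Assumption: the container may be dict-like or a pandas Series of Axes, so handle both.
ax_list = list(axes.values()) if isinstance(axes, dict) else list(axes)
for ax in ax_list:
    ax.set_ylim(low, high)
```

Call plt.show() (with an interactive backend) to display the figure. Note that the mean and standard deviation are themselves sensitive to extreme values, so percentile-based limits such as df['Gross_Margin'].quantile([0.01, 0.99]) are another option if the range still comes out too wide.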

Answered by Padraig

Once you have computed the mean and the standard deviation and derived ymin and ymax from them, use:

plt.ylim(ymin, ymax)

to set the y-axis limits.
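
As a rough illustration of this answer, the sketch below derives ymin and ymax as the mean plus or minus three standard deviations and applies them with plt.ylim. The data is synthetic, and the choice of a single chart drawn with by='Industry' is an assumption made for the example, not something the answer specifies.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; remove for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in for the asker's data (assumption).
rng = np.random.default_rng(1)
df = pd.DataFrame({
    "Industry": rng.choice(["Auto", "Retail", "Tech"], size=300),
    "Gross_Margin": rng.normal(loc=0.3, scale=0.5, size=300),
})

n_std = 3  # number of standard deviations to show on each side of the mean
mean = df["Gross_Margin"].mean()
std = df["Gross_Margin"].std()
ymin, ymax = mean - n_std * std, mean + n_std * std

# One chart with a box per industry, so the current Axes is the one just drawn.
df.boxplot(column="Gross_Margin", by="Industry", figsize=(8, 4))
plt.ylim(ymin, ymax)
```

plt.ylim acts only on the current Axes, so it suits a single chart like this one. For a multi-panel layout such as the groupby boxplot in the question, the limits need to be set on each Axes object instead.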
