Python: Plotting percentages in a seaborn bar plot

Question

Asked by PagMax

For a dataframe

import pandas as pd
df=pd.DataFrame({'group':list("AADABCBCCCD"),'Values':[1,0,1,0,1,0,0,1,0,1,0]})

I am trying to plot a barplot showing the percentage of times each of the groups A, B, C, D takes the value zero (or one).

I have a roundabout way that works, but I think there has to be a more straightforward way:

import seaborn as sns

# Count each (group, Values) pair, with one column per value (0 and 1)
tempdf = df.groupby(['group', 'Values']).Values.count().unstack().fillna(0)
tempdf['total'] = df['group'].value_counts()
tempdf['percent'] = tempdf[0] / tempdf['total'] * 100

tempdf.reset_index(inplace=True)
print(tempdf)

sns.barplot(x='group', y='percent', data=tempdf)

If I were plotting just the mean value, I could simply call sns.barplot on the df dataframe rather than on tempdf. I am not sure how to do this elegantly when I want to plot percentages.

Thanks,

Answer 1

Answered by mgoldwasser

You can use Pandas in conjunction with seaborn to make this easier:

import pandas as pd
import seaborn as sns

df = sns.load_dataset("tips")
x, y, hue = "day", "proportion", "sex"
hue_order = ["Male", "Female"]

(df[x]
 .groupby(df[hue])
 .value_counts(normalize=True)
 .rename(y)
 .reset_index()
 .pipe((sns.barplot, "data"), x=x, y=y, hue=hue))
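
As a rough sketch of how the same value_counts(normalize=True) idea could be applied to the dataframe from the question (the column name "percent" is my own choice, not part of the original answer), the percentages can be computed in long form first:

```python
import pandas as pd

# Dataframe from the question
df = pd.DataFrame({'group': list("AADABCBCCCD"),
                   'Values': [1, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0]})

# Share of each value (0 or 1) within each group, scaled to percent.
# Renaming before reset_index avoids a column-name clash on older pandas.
percent = (df.groupby('group')['Values']
             .value_counts(normalize=True)
             .mul(100)
             .rename('percent')
             .reset_index())

print(percent)
```

The resulting frame has one row per (group, Values) pair, so it could then be passed along the lines of sns.barplot(x='group', y='percent', hue='Values', data=percent).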

Answer 2

Answered by Anton Protopopov

You could pass your own function as the estimator argument of sns.barplot. From the docs:

estimator: callable that maps vector -> scalar, optional
Statistical function to estimate within each categorical bin.

For your case, you could define the function as a lambda:

sns.barplot(x='group', y='Values', data=df, estimator=lambda x: sum(x==0)*100.0/len(x))
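
To sanity-check what such an estimator would plot, here is a minimal sketch that gives the lambda a name and compares it against a plain groupby. The names percent_zero and plot_percent_zero are hypothetical helpers introduced only for this illustration.

```python
import pandas as pd

# Dataframe from the question
df = pd.DataFrame({'group': list("AADABCBCCCD"),
                   'Values': [1, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0]})

def percent_zero(values):
    """Percentage of entries equal to zero (hypothetical helper name)."""
    values = pd.Series(values)
    return (values == 0).mean() * 100.0

# The bar heights the estimator should produce, computed without plotting
expected = df.groupby('group')['Values'].apply(percent_zero)
print(expected)

def plot_percent_zero(data):
    # seaborn is imported lazily so the numeric check above runs without it
    import seaborn as sns
    return sns.barplot(x='group', y='Values', data=data, estimator=percent_zero)
```

Calling plot_percent_zero(df) should draw one bar per group at the heights printed above. By default seaborn also draws bootstrap error bars around each estimate.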

Answer 3

Answered by Ted Petrou

You can use the Dexplot library, which can return relative frequencies for categorical variables. Its API is similar to Seaborn's. Pass the column whose relative frequency you want to the agg parameter. To subdivide by another column, use the hue parameter. The following returns raw counts.

import dexplot as dxp
dxp.aggplot(agg='group', data=df, hue='Values')

To get the relative frequencies, set the normalize parameter to the column you want to normalize over. Use 'all' to normalize over the overall total count.

dxp.aggplot(agg='group', data=df, hue='Values', normalize='group')

Normalizing over the 'Values' column produces a graph in which all the '0' bars sum to 1.

dxp.aggplot(agg='group', data=df, hue='Values', normalize='Values')

Answer 4

Answered by Deepak Natarajan

You can follow these steps to show the percentages on top of the bars in your plot.

The with_hue function plots percentages on the bars when your plot uses the 'hue' parameter. It takes the actual graph, the feature, Number_of_categories (the number of categories in the feature), and hue_categories (the number of categories in the hue feature) as parameters.

The without_hue function plots percentages on the bars of a plot without hue. It takes the actual graph and the feature as parameters.

import matplotlib.pyplot as plt

def with_hue(plot, feature, Number_of_categories, hue_categories):
    # Bar heights and patches, laid out hue level by hue level
    a = [p.get_height() for p in plot.patches]
    patch = list(plot.patches)
    for i in range(Number_of_categories):
        # Total count for category i (assumes bars follow value_counts() order)
        total = feature.value_counts().values[i]
        for j in range(hue_categories):
            p = patch[j*Number_of_categories + i]
            percentage = '{:.1f}%'.format(100 * a[j*Number_of_categories + i]/total)
            x = p.get_x() + p.get_width() / 2 - 0.15
            y = p.get_y() + p.get_height()
            # Annotate on the axes that was passed in (the original used an undefined `ax`)
            plot.annotate(percentage, (x, y), size = 12)
    plt.show()

def without_hue(plot, feature):
    total = len(feature)
    for p in plot.patches:
        # Each bar's height as a share of all observations
        percentage = '{:.1f}%'.format(100 * p.get_height()/total)
        x = p.get_x() + p.get_width() / 2 - 0.05
        y = p.get_y() + p.get_height()
        plot.annotate(percentage, (x, y), size = 12)
    plt.show()
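
As an alternative to positioning the annotations by hand, the sketch below uses matplotlib's Axes.bar_label on the dataframe from the question. This is a different technique from the functions above, not the answerer's method. It assumes matplotlib 3.4 or newer, and the variable names are my own.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Dataframe from the question
df = pd.DataFrame({'group': list("AADABCBCCCD"),
                   'Values': [1, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0]})

# Percentage of zeros per group (the mean of a boolean mask is a proportion)
percent_zero = df['Values'].eq(0).groupby(df['group']).mean().mul(100)

fig, ax = plt.subplots()
bars = ax.bar(list(percent_zero.index), percent_zero.values)

# One formatted label per bar, placed at the top edge of the bar
labels = ['{:.1f}%'.format(v) for v in percent_zero.values]
texts = ax.bar_label(bars, labels=labels)

ax.set_xlabel('group')
ax.set_ylabel('percent of zeros')
```

bar_label takes the container returned by ax.bar, so no manual x/y offsets are needed. In an interactive session, drop the Agg line and call plt.show() to see the figure.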
