Notice: this page reproduces a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/25279810/

Time: 2020-09-13 22:22:01  Source: igfitidea

Seaborn groupby pandas Series

matplotlib pandas seaborn

Asked by Arman

I want to visualize my data into box plots that are grouped by another variable shown here in my terrible drawing:

enter image description here

To do this, I pass a pandas Series to tell seaborn which variable the data is grouped by:

import pandas as pd
import seaborn as sns
# example data for reproducibility
a = pd.DataFrame(
[
[2, 1],
[4, 2],
[5, 1],
[10, 2],
[9, 2],
[3, 1]
])

#converting second column to Series 
a.ix[:,1] = pd.Series(a.ix[:,1])
#Plotting by seaborn
sns.boxplot(a, groupby=a.ix[:,1])

And this is what I get:

seaborn plot

However, I expected two boxplots, each describing only the first column and grouped by the corresponding value in the second column (the one converted to a Series). The plot above instead shows each column separately, which is not what I want.

Answered by Rutger Kassies

A column in a DataFrame is already a Series, so your conversion is not necessary. Furthermore, if you only want to use the first column for both boxplots, you should pass only that column to Seaborn.

So:

# example data for reproducibility
df = pd.DataFrame(
[
[2, 1],
[4, 2],
[5, 1],
[10, 2],
[9, 2],
[3, 1]
], columns=['a', 'b'])

#Plotting by seaborn
sns.boxplot(df.a, groupby=df.b)

I changed your example a little; giving the columns labels makes it a bit clearer, in my opinion.

enter image description here
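The `groupby=` keyword used above comes from an early seaborn release. As far as I know, later versions dropped it in favour of the `x`, `y` and `data` arguments, so on a current installation the same plot would look roughly like the sketch below. The variable name `ax` is only illustrative and not part of the original answer.

```python
import pandas as pd
import seaborn as sns

# Same example data as in the answer above.
df = pd.DataFrame(
    [[2, 1], [4, 2], [5, 1], [10, 2], [9, 2], [3, 1]],
    columns=["a", "b"],
)

# One box per distinct value of "b", each summarising the values of "a".
ax = sns.boxplot(data=df, x="b", y="a")
```

Passing the column names together with `data=df` also labels the axes automatically, which the positional form did not do.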

edit:

If you want to plot all columns separately, you (I think) basically want all combinations of the values in your groupby column and every other column. So if your DataFrame looks like this:

    a   b  grouper
0   2   5        1
1   4   9        2
2   5   3        1
3  10   6        2
4   9   7        2
5   3  11        1

And you want boxplots for columns a and b, grouped by the column grouper, you should flatten the columns and change the groupby column to contain values like a1, a2, b1, etc.

Here is a crude way that I think should work, given the DataFrame shown above:

dfpiv = df.pivot(index=df.index, columns='grouper')

cols_flat = [dfpiv.columns.levels[0][i] + str(dfpiv.columns.levels[1][j]) for i, j in zip(dfpiv.columns.labels[0], dfpiv.columns.labels[1])]  
dfpiv.columns = cols_flat
dfpiv = dfpiv.stack(0)

sns.boxplot(dfpiv, groupby=dfpiv.index.get_level_values(1))

enter image description here

Perhaps there are fancier ways of restructuring the DataFrame. The flattening of the hierarchy after pivoting is especially hard to read; I don't like it.
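One possible alternative to the pivot-and-flatten step is to reshape the frame to long format with `DataFrame.melt` and build the combined labels directly. This is only a sketch under the assumption of a current pandas and seaborn (where, among other changes, the `labels` attribute of a MultiIndex has been renamed to `codes`). The names `long_df`, `column`, `value` and `label` are made up for illustration.

```python
import pandas as pd
import seaborn as sns

# The DataFrame shown in the edit above.
df = pd.DataFrame({
    "a": [2, 4, 5, 10, 9, 3],
    "b": [5, 9, 3, 6, 7, 11],
    "grouper": [1, 2, 1, 2, 2, 1],
})

# Long format: one row per (grouper, original column name, value).
long_df = df.melt(id_vars="grouper", var_name="column", value_name="value")

# Combined labels such as "a1", "a2", "b1", "b2".
long_df["label"] = long_df["column"] + long_df["grouper"].astype(str)

# One box per combined label, in sorted order.
ax = sns.boxplot(
    data=long_df,
    x="label",
    y="value",
    order=sorted(long_df["label"].unique()),
)
```

If the combined labels are not needed, `sns.boxplot(data=long_df, x="column", y="value", hue="grouper")` should give the same four boxes, grouped by column and coloured by group.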
