python: distplot with multiple distributions

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use or share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow. Original question: http://stackoverflow.com/questions/46045750/

Date: 2020-08-19 17:26:05  Source: igfitidea

python seaborn

Asked by Trexion Kameha

I am using seaborn to plot a distribution plot. I would like to plot multiple distributions on the same plot in different colors:

Here's how I start the distribution plot:

import numpy as np
import pandas as pd
import seaborn as sns
from sklearn.datasets import load_iris

iris = load_iris()
iris = pd.DataFrame(data=np.c_[iris['data'], iris['target']],
                    columns=iris['feature_names'] + ['target'])

sns.distplot(iris[['sepal length (cm)']], hist=False, rug=True);

The 'target' column contains 3 values: 0,1,2.

I would like to see one distribution plot for sepal length where target ==0, target ==1, and target ==2 for a total of 3 plots.

Does anyone know how to do that?

Thank you.

Accepted answer by Arda Arslan

The important thing is to split the dataframe into subsets where target is 0, 1, or 2.

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
from sklearn.datasets import load_iris

iris = load_iris()
iris = pd.DataFrame(data=np.c_[iris['data'], iris['target']],
                    columns=iris['feature_names'] + ['target'])

# Split the dataframe into one subset per target value
target_0 = iris.loc[iris['target'] == 0]
target_1 = iris.loc[iris['target'] == 1]
target_2 = iris.loc[iris['target'] == 2]

# Each call draws onto the same axes, so the three curves overlap in one plot
sns.distplot(target_0[['sepal length (cm)']], hist=False, rug=True)
sns.distplot(target_1[['sepal length (cm)']], hist=False, rug=True)
sns.distplot(target_2[['sepal length (cm)']], hist=False, rug=True)

# sns.plt is not available in current seaborn; use matplotlib directly
plt.show()

The output looks like:

[image: output plot]

If you don't know how many values target may have, find the unique values in the target column, then slice the dataframe for each value and add each slice to the plot.

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
from sklearn.datasets import load_iris

iris = load_iris()
iris = pd.DataFrame(data=np.c_[iris['data'], iris['target']],
                    columns=iris['feature_names'] + ['target'])

unique_vals = iris['target'].unique()  # [0, 1, 2]

# Split the dataframe by target value:
# use a list comprehension to create a list of sliced dataframes
targets = [iris.loc[iris['target'] == val] for val in unique_vals]

# Iterate through the list and plot each sliced dataframe on the same axes
for target in targets:
    sns.distplot(target[['sepal length (cm)']], hist=False, rug=True)

# sns.plt is not available in current seaborn; use matplotlib directly
plt.show()

Answered by Abbas

A more common approach for this type of problem is to recast your data into long format using melt, and then let map do the rest.

import numpy as np
import pandas as pd
from sklearn.datasets import load_iris
import seaborn as sns

iris = load_iris()
iris = pd.DataFrame(data=np.c_[iris['data'], iris['target']], 
                    columns=iris['feature_names'] + ['target'])

# recast into long format 
df = iris.melt(['target'], var_name='cols',  value_name='vals')

df.head()

   target               cols  vals
0     0.0  sepal length (cm)   5.1
1     0.0  sepal length (cm)   4.9
2     0.0  sepal length (cm)   4.7
3     0.0  sepal length (cm)   4.6
4     0.0  sepal length (cm)   5.0
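
To make the reshaping step easier to follow, here is a minimal sketch of what melt does, using a hypothetical three-row toy frame (the values are made up for illustration and are not the iris data). Every non-identifier column is stacked into a pair of columns: one holding the former column name, one holding the value.

```python
import pandas as pd

# Hypothetical toy frame: one identifier column plus two measurement columns
wide_df = pd.DataFrame({
    'target': [0, 0, 1],
    'sepal length (cm)': [5.1, 4.9, 7.0],
    'sepal width (cm)': [3.5, 3.0, 3.2],
})

# Keep 'target' as the identifier; all other columns become (cols, vals) rows
long_df = wide_df.melt(['target'], var_name='cols', value_name='vals')

print(wide_df.shape)  # (3, 3)
print(long_df.shape)  # (6, 3): 3 rows x 2 measurement columns
print(long_df)
```

The long frame has one row per (original row, measurement column) pair, which is the layout FacetGrid expects when faceting on the cols column.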

You can now plot simply by creating a FacetGrid and using map:

g = sns.FacetGrid(df, col='cols', hue="target", palette="Set1")
g = (g.map(sns.distplot, "vals", hist=False, rug=True))

Answered by toliveira

I have found a simpler solution using FacetGrid, posted by citynorman at https://github.com/mwaskom/seaborn/issues/861:

import numpy as np
import pandas as pd
import seaborn as sns
from sklearn.datasets import load_iris

iris = load_iris()
iris = pd.DataFrame(data=np.c_[iris['data'], iris['target']],
                    columns=iris['feature_names'] + ['target'])

# hue="target" draws one curve per target value on the same axes
g = sns.FacetGrid(iris, hue="target")
g = g.map(sns.distplot, "sepal length (cm)", hist=False, rug=True)
