How to plot multiple density plots on the same figure in Python

Source: Stack Overflow, http://stackoverflow.com/questions/43463438/ (licensed under CC BY-SA 4.0; if you reuse this content you must keep the same license and attribute it to the original authors)

python pandas matplotlib plot probability-density

Asked by JAG2024

I know this is going to end up being a really messy plot, but I am curious to know what the most efficient way to do this is. I have some data that looks like this in a csv file:

    ROI          Band   Min         Max         Mean        Stdev
1   red_2        Band 1 0.032262    0.124425    0.078073    0.028031
2   red_2        Band 2 0.021072    0.064156    0.037923    0.012178
3   red_2        Band 3 0.013404    0.066043    0.036316    0.014787
4   red_2        Band 4 0.005162    0.055781    0.015526    0.013255
5   red_3        Band 1 0.037488    0.10783     0.057892    0.018964
6   red_3        Band 2 0.02814     0.07237     0.04534     0.014507
7   red_3        Band 3 0.01496     0.112973    0.032751    0.026575
8   red_3        Band 4 0.006566    0.029133    0.018201    0.006897
9   red_4        Band 1 0.022841    0.148666    0.065844    0.0336
10  red_4        Band 2 0.018651    0.175298    0.046383    0.042339
11  red_4        Band 3 0.012256    0.045111    0.024035    0.009711
12  red_4        Band 4 0.001493    0.033822    0.014678    0.007788
13  red_5        Band 1 0.030513    0.18098     0.090056    0.044456
37  bcs_1        Band 1 0.013059    0.076753    0.037674    0.023172
38  bcs_1        Band 2 0.035227    0.08826     0.057672    0.015005
39  bcs_1        Band 3 0.005223    0.028459    0.010836    0.006003
40  bcs_1        Band 4 0.009804    0.031457    0.018094    0.007136
41  bcs_2        Band 1 0.018134    0.083854    0.040654    0.018333
42  bcs_2        Band 2 0.016123    0.088613    0.045742    0.020168
43  bcs_2        Band 3 0.008065    0.030557    0.014596    0.007435
44  bcs_2        Band 4 0.004789    0.016514    0.009815    0.003241
45  bcs_3        Band 1 0.021092    0.077993    0.037246    0.013696
46  bcs_3        Band 2 0.011918    0.068825    0.028775    0.013758
47  bcs_3        Band 3 0.003969    0.021714    0.011336    0.004964
48  bcs_3        Band 4 0.003053    0.015763    0.006283    0.002425
49  bcs_4        Band 1 0.024466    0.079989    0.049291    0.018032
50  bcs_4        Band 2 0.009274    0.093137    0.041979    0.019347
51  bcs_4        Band 3 0.006874    0.027214    0.014386    0.005386
52  bcs_4        Band 4 0.005679    0.026662    0.014529    0.006505

I want to create one probability density plot with 8 lines: 4 for the 4 bands of the "red" ROIs and 4 for the 4 bands of the "black" ROIs. So far I have this working only for Band 1 of the red and black ROIs, and my code outputs two separate plots instead of one. I have tried using subplot, but that has not worked for me.

Help? I know my approach is verbose and clunky, so smarter solutions much appreciated!

Load packages

import csv 
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

files = ['example.csv']

Organize the data

for f in files:
    fn = f.split('.')[0]
    dat = pd.read_csv(f)
    df0 = dat.loc[:, ['ROI', 'Band', 'Mean']]
    # parse by soil type
    red = df0[df0['ROI'].str.contains("red")]
    black = df0[df0['ROI'].str.contains("bcs")]
    # parse by band 
    red.b1 = red[red['Band'].str.contains("Band 1")]
    red.b2 = red[red['Band'].str.contains("Band 2")]
    red.b3 = red[red['Band'].str.contains("Band 3")]
    red.b4 = red[red['Band'].str.contains("Band 4")]
    black.b1 = black[black['Band'].str.contains("Band 1")]
    black.b2 = black[black['Band'].str.contains("Band 2")]
    black.b3 = black[black['Band'].str.contains("Band 3")]
    black.b4 = black[black['Band'].str.contains("Band 4")]

Plot the figure

pd.DataFrame(black.b1).plot(kind="density")
pd.DataFrame(red.b1).plot(kind="density")
plt.show()

I'd like for the figure to have 8 lines on it.

Answered by piRSquared

groupby + str.split

df.groupby([df.ROI.str.split('_').str[0], 'Band']).Mean.plot.kde();

If you want a legend

df.groupby([df.ROI.str.split('_').str[0], 'Band']).Mean.plot.kde()
plt.legend();
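
For anyone who wants to try the groupby approach without the original CSV, here is a minimal sketch. It is not piRSquared's code: the data is synthetic (random means, only the ROI / Band / Mean column names come from the question), the Axes is passed explicitly through `ax=`, and the output filename is arbitrary. KDE plotting in pandas needs scipy installed.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; drop for an interactive window
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in for the question's CSV (assumption: same ROI / Band / Mean columns).
rng = np.random.default_rng(0)
rows = []
for soil_prefix in ("red", "bcs"):
    for roi in range(1, 6):
        for band in range(1, 5):
            rows.append({
                "ROI": f"{soil_prefix}_{roi}",
                "Band": f"Band {band}",
                "Mean": rng.uniform(0.005, 0.09),
            })
df = pd.DataFrame(rows)

# Soil type is the part of ROI before the underscore: "red_2" -> "red".
soil = df["ROI"].str.split("_").str[0].rename("Soil")

# One Axes, one KDE curve per (soil, band) group: 2 soils x 4 bands = 8 curves.
fig, ax = plt.subplots()
for (soil_name, band), means in df.groupby([soil, "Band"])["Mean"]:
    means.plot.kde(ax=ax, label=f"{soil_name}, {band}")
ax.set_xlabel("Mean")
ax.legend()
fig.savefig("density_by_soil_and_band.png")  # arbitrary filename for this sketch
```

Looping over the groups is a bit longer than the one-liner above, but it lets you set each curve's label yourself and makes it obvious which Axes everything lands on.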

Answered by Robbie

Something to help lead you in the right direction:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame()
for i in range(8):
    mean = 5-10*np.random.rand()
    std = 6*np.random.rand()
    df['score_{0}'.format(i)] = np.random.normal(mean, std, 60)

fig, ax = plt.subplots(1, 1)
for s in df.columns:
    # pass the Axes explicitly so every curve is drawn on the same figure
    df[s].plot(kind='density', ax=ax)
plt.show()

Basically just looping through the columns, and plotting as you go. Having more control over the figure is very helpful.
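
Applied to the question's own data, the same idea might look like the sketch below. The Band 1 means are copied from the table in the question; the series names and the output filename are just illustrative choices. As far as I can tell, the original code got two figures because each `DataFrame.plot` call without an `ax` argument opens a new figure, so handing the same Axes to every call is what keeps the curves together.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; drop for an interactive window
import matplotlib.pyplot as plt
import pandas as pd

# Band 1 "Mean" values copied from the question's table.
red_b1 = pd.Series([0.078073, 0.057892, 0.065844, 0.090056], name="red, Band 1")
black_b1 = pd.Series([0.037674, 0.040654, 0.037246, 0.049291], name="black, Band 1")

fig, ax = plt.subplots()
for series in (red_b1, black_b1):
    # ax=ax sends both density curves to the same Axes instead of separate figures
    series.plot(kind="density", ax=ax)
ax.legend()
fig.savefig("band1_density.png")  # arbitrary filename for this sketch
```

Extending this to all 8 curves is just a matter of adding the other six series to the loop.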
