Source: Stack Overflow, http://stackoverflow.com/questions/24032282/. This content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors and keep the same license.


Create Contour Plot from Pandas Groupby Dataframe

Tags: python, matplotlib, pandas, group-by, contour

Asked by Balzer82

I have the following pandas DataFrame:

In [66]: hdf.size()
Out[66]:
a           b
0           0.0          21004
            0.1         119903
            0.2         186579
            0.3         417349
            0.4         202723
            0.5         100906
            0.6          56386
            0.7           6080
            0.8           3596
            0.9           2391
            1.0           1963
            1.1           1730
            1.2           1663
            1.3           1614
            1.4           1309
...
186         0.2         15
            0.3          9
            0.4         21
            0.5          4
187         0.2          3
            0.3         10
            0.4         22
            0.5         10
188         0.0         11
            0.1         19
            0.2         20
            0.3         13
            0.4          7
            0.5          5
            0.6          1
Length: 4572, dtype: int64

As you can see, a runs from 0 to 188, and within each group b ranges from some value to some other value. The designated Z value is the count of occurrences of each a/b pair.

How can I get a contour or heatmap plot out of the grouped DataFrame?

This is what I have so far (the ? placeholders are what I am asking about):

numcols, numrows = 30, 30
xi = np.linspace(0, 200, numcols)
yi = np.linspace(0, 6, numrows)
xi, yi = np.meshgrid(xi, yi)
zi = griddata(?, ?, hdf.size().values, xi, yi)

How do I get the x and y values out of the groupby object and plot a contour?
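
One possible way to fill in the two ? placeholders, as a minimal sketch rather than the method from the answers: after .size(), the group keys sit in the two levels of the result's MultiIndex and can be read with get_level_values. The sketch assumes SciPy is installed and uses scipy.interpolate.griddata, because matplotlib.mlab.griddata (which the snippet above appears to use) is no longer available in current Matplotlib releases. The aggdf here is made-up stand-in data, since the real frame is not shown.

```python
import numpy as np
import pandas as pd
from scipy.interpolate import griddata

# Hypothetical stand-in data; the real aggdf from the question is not shown.
rng = np.random.default_rng(0)
aggdf = pd.DataFrame({
    "a": rng.integers(0, 189, size=5000),
    # "+ 0.0" turns any -0.0 produced by rounding into 0.0
    "b": np.round(rng.normal(0.3, 0.3, size=5000), 1) + 0.0,
})

# Series of counts indexed by a two-level MultiIndex (a, b)
counts = aggdf.groupby(["a", "b"]).size()

# The group keys live in the index, one level per grouping column.
x = counts.index.get_level_values("a").to_numpy()
y = counts.index.get_level_values("b").to_numpy()
z = counts.to_numpy()

numcols, numrows = 30, 30
xi = np.linspace(x.min(), x.max(), numcols)
yi = np.linspace(y.min(), y.max(), numrows)
xi, yi = np.meshgrid(xi, yi)

# SciPy's griddata takes the scattered points as one (n, 2) array.
# Grid nodes outside the convex hull of the data come back as NaN.
zi = griddata(np.column_stack([x, y]), z, (xi, yi), method="linear")
```

The resulting xi, yi, zi can be passed to plt.contourf(xi, yi, zi). Interpolating counts this way smooths over (a, b) pairs that never occur, which may or may not be what you want.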

Answered by Balzer82

Thanks a lot! My mistake was not realizing that I have to apply a function such as .size() to the groupby object before I can work with it.

hdf = aggdf.groupby(['a','b']).size()
hdf

gives me

a           b
1           -2.0          1
            -1.9          1
            -1.8          1
            -1.7          2
            -1.6          5
            -1.5         10
            -1.4          9
            -1.3         21
            -1.2         34
            -1.1         67
            -1.0         65
            -0.9         94
            -0.8        180
            -0.7        242
            -0.6        239
...
187          0.4        22
             0.5        10
188         -0.6         2
            -0.5         2
            -0.4         1
            -0.3         2
            -0.2         5
            -0.1        10
            -0.0        18
             0.1        19
             0.2        20
             0.3        13
             0.4         7
             0.5         5
             0.6         1
Length: 8844, dtype: int64

With that, and with your help, CT Zhu, I could then do

hdfreset = hdf.reset_index()
hdfreset.columns = ['a', 'b', 'occurrence']
hdfpivot = hdfreset.pivot(index='a', columns='b')  # keyword arguments; positional ones are rejected by newer pandas

and this finally gave me the values I needed for the plot:

X=hdfpivot.columns.levels[1].values
Y=hdfpivot.index.values
Z=hdfpivot.values
Xi,Yi = np.meshgrid(X, Y)
plt.contourf(Yi, Xi, Z, alpha=0.7, cmap=plt.cm.jet);

which leads to this beautiful contourf plot:

[Image: the resulting contourf plot]
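
For readers who want to run the steps above end to end, here is a minimal sketch. It assumes made-up data in place of aggdf (which the post does not show) and makes two small variations on the answer: reset_index(name=...) names the count column directly, and pivot is given values= so that the columns are a plain index of b values instead of a two-level one. The Agg backend is also an assumption, chosen only so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend (assumption); drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical stand-in for aggdf, which the post does not show.
rng = np.random.default_rng(1)
aggdf = pd.DataFrame({
    "a": rng.integers(1, 20, size=20000),
    # "+ 0.0" turns any -0.0 produced by rounding into 0.0
    "b": np.round(rng.normal(0.0, 0.5, size=20000), 1) + 0.0,
})

hdf = aggdf.groupby(["a", "b"]).size()

# Turn the index levels back into columns and name the count column.
hdfreset = hdf.reset_index(name="occurrence")
# One row per a, one column per b; values= keeps the columns flat.
hdfpivot = hdfreset.pivot(index="a", columns="b", values="occurrence")

X = hdfpivot.columns.to_numpy()   # b values
Y = hdfpivot.index.to_numpy()     # a values
Z = hdfpivot.to_numpy()           # counts, NaN where an (a, b) pair never occurs
Xi, Yi = np.meshgrid(X, Y)        # both have shape (len(Y), len(X)), same as Z

fig, ax = plt.subplots()
cs = ax.contourf(Yi, Xi, Z, alpha=0.7, cmap="jet")  # a on the x-axis, b on the y-axis
ax.set_xlabel("a")
ax.set_ylabel("b")
fig.colorbar(cs, ax=ax, label="occurrence")
# Call plt.show() or fig.savefig(...) to view the figure.
```

Because Yi is passed first, a ends up on the horizontal axis, matching the argument order used in the answer.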

Answered by CT Zhu

Welcome to SO.

It is clear that the number of 'b' levels is not the same for each of your 'a' levels, so I suggest the following solution:

In [44]:

print(df)  # an example; you can get your DataFrame into this shape with reset_index()
    a  b     value
0   0  1  0.336885
1   0  2  0.276750
2   0  3  0.796488
3   1  1  0.156050
4   1  2  0.401942
5   1  3  0.252651
6   2  1  0.861911
7   2  2  0.914803
8   2  3  0.869331
9   3  1  0.284757
10  3  2  0.488330

[11 rows x 3 columns]
In [45]:
# note that (a, b) pairs with no row become NaN
df = df.pivot(index='a', columns='b', values='value')
In [46]:

X = df.columns.values
Y = df.index.values
Z = df.values
x, y = np.meshgrid(X, Y)
plt.contourf(x, y, Z)  # the NaN cells are left as white space
Out[46]:
<matplotlib.contour.QuadContourSet instance at 0x1081385a8>

[Image: contourf plot of the example data, with the NaN region left white]
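
The NaN handling mentioned above can be looked at in isolation. The sketch below rebuilds the example table from this answer and shows two options: keeping the NaN so the plot leaves that region blank, or filling it with 0. Filling with 0 is an assumption that suits occurrence counts (a pair that never occurs has a count of zero); it is usually not appropriate for arbitrary measured values.

```python
import numpy as np
import pandas as pd

# The example table from the answer: a=3 has no row for b=3.
df = pd.DataFrame({
    "a": [0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3],
    "b": [1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2],
    "value": [0.336885, 0.276750, 0.796488, 0.156050, 0.401942, 0.252651,
              0.861911, 0.914803, 0.869331, 0.284757, 0.488330],
})

grid = df.pivot(index="a", columns="b", values="value")

# Option 1: keep the gap. Masking the NaN explicitly makes the intent visible;
# the masked cell is the one left blank in the plot.
Z_nan = np.ma.masked_invalid(grid.to_numpy())

# Option 2 (assumption: a missing pair means zero): fill the gap.
Z_zero = grid.fillna(0).to_numpy()
```

Either array can be passed as Z to plt.contourf(x, y, Z) together with the meshgrid built from grid.columns and grid.index.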