Notice: this page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and authors, and attribute it to the original authors (not me): StackOverflow
Original source: http://stackoverflow.com/questions/24032282/
Create Contour Plot from Pandas Groupby Dataframe
Asked by Balzer82
I have the following pandas DataFrame:
In [66]: hdf.size()
Out[66]:
a    b
0    0.0     21004
     0.1    119903
     0.2    186579
     0.3    417349
     0.4    202723
     0.5    100906
     0.6     56386
     0.7      6080
     0.8      3596
     0.9      2391
     1.0      1963
     1.1      1730
     1.2      1663
     1.3      1614
     1.4      1309
...
186  0.2        15
     0.3         9
     0.4        21
     0.5         4
187  0.2         3
     0.3        10
     0.4        22
     0.5        10
188  0.0        11
     0.1        19
     0.2        20
     0.3        13
     0.4         7
     0.5         5
     0.6         1
Length: 4572, dtype: int64
As you can see, a runs from 0 to 188, and within each group b runs over some range of values. The Z value I want to plot is the number of occurrences of each a/b pair.
How can I get a contour or heatmap plot out of the grouped DataFrame?
This is what I have so far (I am asking about the parts marked ?):
numcols, numrows = 30, 30
xi = np.linspace(0, 200, numcols)
yi = np.linspace(0, 6, numrows)
xi, yi = np.meshgrid(xi, yi)
zi = griddata(?, ?, hdf.size().values, xi, yi)
How do I get the x and y values out of the GroupBy object and plot a contour?
Answered by Balzer82
Thanks a lot! My mistake was not realizing that I have to apply some function, such as .size(), to the groupby object before I can work with it...
hdf = aggdf.groupby(['a','b']).size()
hdf
gives me
a    b
1    -2.0      1
     -1.9      1
     -1.8      1
     -1.7      2
     -1.6      5
     -1.5     10
     -1.4      9
     -1.3     21
     -1.2     34
     -1.1     67
     -1.0     65
     -0.9     94
     -0.8    180
     -0.7    242
     -0.6    239
...
187   0.4     22
      0.5     10
188  -0.6      2
     -0.5      2
     -0.4      1
     -0.3      2
     -0.2      5
     -0.1     10
     -0.0     18
      0.1     19
      0.2     20
      0.3     13
      0.4      7
      0.5      5
      0.6      1
Length: 8844, dtype: int64
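As a minimal sketch of the point above: a groupby object only yields values once it is aggregated, and `.size()` returns a Series indexed by the (a, b) pairs, so the x and y coordinates can be read straight from that index. The small `aggdf` frame below is made-up illustration data, not the asker's dataset.

```python
import pandas as pd

# Hypothetical stand-in for the asker's aggdf (made-up values).
aggdf = pd.DataFrame({
    "a": [0, 0, 0, 1, 1, 2],
    "b": [0.1, 0.1, 0.2, 0.1, 0.3, 0.2],
})

# size() counts the rows in each (a, b) group and returns a Series
# whose index is a MultiIndex with levels 'a' and 'b'.
hdf = aggdf.groupby(["a", "b"]).size()

# One x, y and z value per (a, b) pair that actually occurs.
x = hdf.index.get_level_values("a").to_numpy()
y = hdf.index.get_level_values("b").to_numpy()
z = hdf.to_numpy()

print(hdf)
```

These three flat arrays are scattered points rather than a regular grid, which is why the pivot step that follows is needed before calling `contourf`.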
With that, and with your help, CT Zhu, I could then do:
hdfreset = hdf.reset_index()
hdfreset.columns = ['a', 'b', 'occurrence']
hdfpivot = hdfreset.pivot(index='a', columns='b')  # keyword arguments; recent pandas versions no longer accept them positionally
and this finally gave me the correct values to plot:
X=hdfpivot.columns.levels[1].values
Y=hdfpivot.index.values
Z=hdfpivot.values
Xi,Yi = np.meshgrid(X, Y)
plt.contourf(Yi, Xi, Z, alpha=0.7, cmap=plt.cm.jet);
which leads to a beautiful contourf plot.
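One possible shortcut, sketched here on made-up data: since the result of `.size()` already carries `a` and `b` in its index, `Series.unstack` can build the same a-by-b grid without the `reset_index` and rename steps. The sketch also passes `values='occurrence'` to `pivot`, which gives flat column labels, so `columns.levels[1]` is not needed in this variant. The `aggdf` contents are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical illustration data (not the asker's dataset).
aggdf = pd.DataFrame({
    "a": [1, 1, 1, 2, 2, 3],
    "b": [-0.1, 0.0, 0.0, 0.0, 0.1, 0.1],
})
hdf = aggdf.groupby(["a", "b"]).size()

# Route used in the answer: reset_index, rename, pivot.
hdfreset = hdf.reset_index()
hdfreset.columns = ["a", "b", "occurrence"]
hdfpivot = hdfreset.pivot(index="a", columns="b", values="occurrence")

# Alternative: move the 'b' index level into the columns directly.
grid = hdf.unstack("b")

X = grid.columns.to_numpy()   # b values
Y = grid.index.to_numpy()     # a values
Z = grid.to_numpy()           # counts, NaN where a pair never occurs
Xi, Yi = np.meshgrid(X, Y)    # both have the same shape as Z
```

Either grid can then be passed to `plt.contourf(Yi, Xi, Z)` as in the answer; which route reads better is largely a matter of taste.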


Answered by CT Zhu
Welcome to SO.
It seems clear that the number of 'b' levels differs from one 'a' level to the next, so I suggest the following solution:
In [44]:
print(df)  # an example; you can get your DataFrame into this form with reset_index()
    a  b     value
0   0  1  0.336885
1   0  2  0.276750
2   0  3  0.796488
3   1  1  0.156050
4   1  2  0.401942
5   1  3  0.252651
6   2  1  0.861911
7   2  2  0.914803
8   2  3  0.869331
9   3  1  0.284757
10  3  2  0.488330
[11 rows x 3 columns]
In [45]:
# note that missing (a, b) combinations become NaN values
df = df.pivot(index='a', columns='b', values='value')
In [46]:
X=df.columns.values
Y=df.index.values
Z=df.values
x,y=np.meshgrid(X, Y)
plt.contourf(x, y, Z)  # NaN cells are left blank (white) in the plot
Out[46]:
<matplotlib.contour.QuadContourSet instance at 0x1081385a8>
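For reference, the steps in this answer can be put together as one self-contained script. This is only a sketch: it reuses the example values printed above, and the `Agg` backend, the `wide` and `n_missing` names, and the axis labels are assumptions added for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# The example values shown above; the pair (a=3, b=3) is absent.
df = pd.DataFrame({
    "a": [0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3],
    "b": [1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2],
    "value": [0.336885, 0.276750, 0.796488, 0.156050, 0.401942, 0.252651,
              0.861911, 0.914803, 0.869331, 0.284757, 0.488330],
})

# Rows are 'a', columns are 'b'; missing pairs become NaN.
wide = df.pivot(index="a", columns="b", values="value")
n_missing = int(wide.isna().sum().sum())  # number of cells without data

x, y = np.meshgrid(wide.columns.to_numpy(), wide.index.to_numpy())

fig, ax = plt.subplots()
cs = ax.contourf(x, y, wide.to_numpy())  # NaN cells are masked, not drawn
ax.set_xlabel("b")
ax.set_ylabel("a")
fig.colorbar(cs, ax=ax)
plt.close(fig)  # call fig.savefig(...) or plt.show() before this to view the plot
```

For count data such as the asker's, replacing the gaps with `fillna(0)` before plotting may be more appropriate, since an absent pair simply means zero occurrences.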



