pandas: plotting a dataframe as both 'hist' and 'kde' on the same plot

Note: this page reproduces a popular StackOverflow question under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep it under the same license, cite the original URL, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/39987071/

Time: 2020-09-14 02:11:07  Source: igfitidea

Plotting a dataframe as both a 'hist' and 'kde' on the same plot

python pandas matplotlib plot dataframe

Asked by Lukasz

I have a pandas dataframe with user information. I would like to plot the age of users as both a kind='kde' and a kind='hist' on the same plot. At the moment I am able to have the two separate plots. The dataframe resembles:

member_df=    
user_id    Age
1          23
2          34
3          63 
4          18
5          53  
...

using

ax1 = plt.subplot2grid((2,3), (0,0))
member_df.Age.plot(kind='kde', xlim=[16, 100])
ax1.set_xlabel('Age')

ax2 = plt.subplot2grid((2,3), (0,1))
member_df.Age.plot(kind='hist', bins=40)
ax2.set_xlabel('Age')

ax3 = ...

I understand that kind='hist' will give me frequencies for the y-axis whereas kind='kde' will give a cumulative distribution, but is there a way to combine both and have the y-axis be represented by the frequencies?

Answered by piRSquared

pd.DataFrame.plot() returns the ax it is plotting to. You can reuse this for other plots.

Try:

ax = member_df.Age.plot(kind='kde')
member_df.Age.plot(kind='hist', bins=40, ax=ax)
ax.set_xlabel('Age')

example
I plot hist first to put it in the background
Also, I put kde on the secondary_y axis

import pandas as pd
import numpy as np


np.random.seed([3,1415])
df = pd.DataFrame(np.random.randn(100, 2), columns=list('ab'))

ax = df.a.plot(kind='hist')
df.a.plot(kind='kde', ax=ax, secondary_y=True)
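
If a second y-axis is not wanted, one alternative (a sketch, not part of the answer above) is to normalise the histogram with density=True, which pandas forwards to matplotlib's hist, so the bars and the KDE curve share a single density scale. The headless backend, the seed, and the bin count are assumptions made for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import numpy as np
import pandas as pd

# Illustrative random data, not taken from the question
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((100, 2)), columns=list("ab"))

# density=True rescales the bars so their total area is 1,
# which is the same scale the KDE curve is drawn on
ax = df["a"].plot(kind="hist", bins=20, density=True, alpha=0.5)
df["a"].plot(kind="kde", ax=ax)  # same axes, no secondary_y
ax.set_xlabel("a")
```

With this variant the y-axis reads as density rather than counts, so it suits comparing shapes more than reading off frequencies.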

response to comment
using subplot2grid, just reuse ax1

import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

ax1 = plt.subplot2grid((2,3), (0,0))

np.random.seed([3,1415])
df = pd.DataFrame(np.random.randn(100, 2), columns=list('ab'))

df.a.plot(kind='hist', ax=ax1)
df.a.plot(kind='kde', ax=ax1, secondary_y=True)
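
The question asks for frequencies on the y-axis. One way to get that on a single axis, sketched here under assumptions, is to evaluate the KDE with scipy.stats.gaussian_kde and multiply the density by the number of observations times the bin width, which converts it to an expected count per bin. The ages, the bin count, and the variable names below are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from scipy.stats import gaussian_kde

# Hypothetical ages standing in for member_df.Age
rng = np.random.default_rng(0)
age = pd.Series(rng.integers(16, 90, size=500), name="Age")

bins = 40
fig, ax = plt.subplots()
counts, edges, _ = ax.hist(age, bins=bins, alpha=0.5)  # y-axis is raw counts
bin_width = edges[1] - edges[0]

# density * n * bin_width = expected count per bin
xs = np.linspace(edges[0], edges[-1], 200)
kde = gaussian_kde(age.to_numpy(dtype=float))
kde_counts = kde(xs) * len(age) * bin_width
ax.plot(xs, kde_counts)
ax.set_xlabel("Age")
ax.set_ylabel("Frequency")
```

Because the curve is rescaled to counts, the histogram and the KDE share one axis and one zero line, so no secondary_y is needed.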

Answered by Javi

In case you want it for all the columns of your dataframe:

import matplotlib.pyplot as plt

fig, ax = plt.subplots(8, 3, figsize=(20, 50))
# you can change the grid layout; I had 22 columns, so 8x3 is fine for me
fig.subplots_adjust(hspace=.2, wspace=.2)

ax = ax.ravel()

for i in range(len(I_df.columns)):
    # histogram first (background), then the KDE on a secondary y-axis
    I_df.iloc[:, i].plot(kind='hist', ax=ax[i])
    I_df.iloc[:, i].plot(kind='kde', ax=ax[i], secondary_y=True)
    ax[i].set_title(I_df.columns[i])
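
The 8x3 grid is hard-coded for 22 columns. A small sketch of how the grid could instead be derived from the column count, with unused panels hidden, follows. The frame I_df here is a hypothetical stand-in built from random data, and ncols = 3 is an arbitrary choice.

```python
import math
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical stand-in for I_df: 7 numeric columns
rng = np.random.default_rng(0)
I_df = pd.DataFrame(rng.standard_normal((200, 7)),
                    columns=[f"col{i}" for i in range(7)])

ncols = 3
nrows = math.ceil(len(I_df.columns) / ncols)  # enough rows for every column
fig, axes = plt.subplots(nrows, ncols,
                         figsize=(5 * ncols, 4 * nrows), squeeze=False)
axes = axes.ravel()

for ax, col in zip(axes, I_df.columns):
    I_df[col].plot(kind="hist", ax=ax)
    I_df[col].plot(kind="kde", ax=ax, secondary_y=True)
    ax.set_title(col)

# Hide the panels that have no column to show
for ax in axes[len(I_df.columns):]:
    ax.set_visible(False)
```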

I hope it helps :)

Answered by jedi

It is better and even simpler to use seaborn.displot (seaborn.distplot in seaborn releases before 0.11). Prior proposed solutions had the KDE plot appear a little "shifted up" for me. seaborn.distplot accurately lined up zeros between the hist and kde plots.

import seaborn as sns
sns.displot(df.a, kde=True)