pandas: plotting a dataframe as both 'hist' and 'kde' on the same plot

Question

Asked by Lukasz

I have a pandas dataframe with user information. I would like to plot the age of users with both kind='kde' and kind='hist' on the same plot. At the moment I am only able to produce two separate plots. The dataframe resembles:

member_df=    
user_id    Age
1          23
2          34
3          63 
4          18
5          53  
...

using

ax1 = plt.subplot2grid((2,3), (0,0))
member_df.Age.plot(kind='kde', xlim=[16, 100])
ax1.set_xlabel('Age')

ax2 = plt.subplot2grid((2,3), (0,1))
member_df.Age.plot(kind='hist', bins=40)
ax2.set_xlabel('Age')

ax3 = ...

I understand that kind='hist' will give me frequencies on the y-axis whereas kind='kde' will give a density estimate, but is there a way to combine both and have the y-axis represent the frequencies?

Answer 1

Answered by piRSquared

pd.DataFrame.plot() returns the ax it is plotting to. You can reuse this for other plots.

Try:

ax = member_df.Age.plot(kind='kde')
member_df.Age.plot(kind='hist', bins=40, ax=ax)
ax.set_xlabel('Age')

Example
I plot hist first to put it in the background.
Also, I put kde on a secondary_y axis.

import pandas as pd
import numpy as np


np.random.seed([3,1415])
df = pd.DataFrame(np.random.randn(100, 2), columns=list('ab'))

ax = df.a.plot(kind='hist')
df.a.plot(kind='kde', ax=ax, secondary_y=True)
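
With secondary_y=True the histogram counts and the density sit on two independent y-axes. As an alternative sketch (not part of the original answer), the histogram can be normalized so that both plots share one density axis. It assumes that extra keyword arguments such as density=True are forwarded to matplotlib's hist, and it uses made-up random data and an arbitrary bin count.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Made-up data for illustration only
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((100, 2)), columns=list("ab"))

fig, ax = plt.subplots()

# density=True rescales the bars so that their total area is 1,
# which puts them on the same scale as the KDE curve
df.a.plot(kind="hist", bins=20, density=True, alpha=0.5, ax=ax)
df.a.plot(kind="kde", ax=ax)
ax.set_xlabel("a")
```

The y-axis now shows density rather than counts, so this fits when a shared zero line matters more than reading off frequencies. Call plt.show() to display the figure in a script.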

Response to comment
Using subplot2grid: just reuse ax1.

import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

ax1 = plt.subplot2grid((2,3), (0,0))

np.random.seed([3,1415])
df = pd.DataFrame(np.random.randn(100, 2), columns=list('ab'))

# draw both plots on the same subplot; kde goes on a secondary y-axis
df.a.plot(kind='hist', ax=ax1)
df.a.plot(kind='kde', ax=ax1, secondary_y=True)
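
The question asks for a single y-axis in frequencies. One way to sketch that (not from the original answer) is to evaluate the density yourself and multiply it by the number of observations times the bin width, which converts density into expected counts per bin. The example assumes SciPy is installed (pandas' own kde plot already needs it) and uses made-up ages in place of member_df.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy.stats import gaussian_kde

# Made-up ages for illustration; stands in for member_df from the question
rng = np.random.default_rng(0)
member_df = pd.DataFrame(
    {"Age": rng.normal(40, 12, size=500).clip(18, 90).round()}
)

ages = member_df["Age"].to_numpy(dtype=float)
counts, edges = np.histogram(ages, bins=40)
bin_width = edges[1] - edges[0]

fig = plt.figure()
ax1 = plt.subplot2grid((2, 3), (0, 0))
ax1.hist(ages, bins=edges, alpha=0.5)

# The density integrates to 1, so density * N * bin_width is the
# expected number of observations in a bin of that width
xs = np.linspace(edges[0], edges[-1], 200)
kde_counts = gaussian_kde(ages)(xs) * len(ages) * bin_width
ax1.plot(xs, kde_counts)
ax1.set_xlabel("Age")
ax1.set_ylabel("Frequency")
```

Because both artists are drawn on ax1 with the same units, no secondary axis is needed. The scaling only holds for equal-width bins.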

Answer 2

Answered by Javi

In case you want it for all the columns of your dataframe:

import matplotlib.pyplot as plt

# I_df is the dataframe to plot.
# You can change the grid; I had 22 columns, so 8x3 was fine for me.
fig, ax = plt.subplots(8, 3, figsize=(20, 50))
fig.subplots_adjust(hspace=.2, wspace=.2)

ax = ax.ravel()

for i in range(len(I_df.columns)):
    # histogram in the background, kde on a secondary y-axis
    I_df.iloc[:, i].plot(kind='hist', ax=ax[i])
    I_df.iloc[:, i].plot(kind='kde', ax=ax[i], secondary_y=True)
    # set the title on this subplot explicitly instead of on the current axes
    ax[i].set_title(I_df.columns[i])

I hope it helps :)
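
The grid size above is hard-coded for 22 columns. A small sketch of how it could be derived from the column count instead, with unused cells hidden. The dataframe here is a hypothetical stand-in for I_df with 7 random columns, and the figure size per cell is an arbitrary choice.

```python
import math

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical stand-in for I_df: 7 numeric columns of random data
rng = np.random.default_rng(0)
I_df = pd.DataFrame(
    rng.standard_normal((200, 7)),
    columns=[f"col{i}" for i in range(7)],
)

ncols = 3
nrows = math.ceil(len(I_df.columns) / ncols)
fig, axes = plt.subplots(
    nrows, ncols, figsize=(5 * ncols, 4 * nrows), squeeze=False
)
axes = axes.ravel()

for ax, col in zip(axes, I_df.columns):
    I_df[col].plot(kind="hist", ax=ax)
    I_df[col].plot(kind="kde", ax=ax, secondary_y=True)
    ax.set_title(col)

# Hide the grid cells that have no column to show
for ax in axes[len(I_df.columns):]:
    ax.set_visible(False)
```

squeeze=False keeps axes two-dimensional even for a single row, so ravel() works for any number of columns.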

Answer 3

Answered by jedi

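It is better and even simpler to use seaborn. The solutions proposed above made the KDE plot appear a little "shifted up" for me, whereas seaborn lined up the zeros of the hist and kde plots accurately. seaborn.distplot drew both by default but is deprecated as of seaborn 0.11; with its replacement, seaborn.displot, pass kde=True to overlay the KDE on the histogram:
import seaborn as sns
sns.displot(df.a, kde=True)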
