Original source: http://stackoverflow.com/questions/51749208/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Plotting two histograms from a pandas DataFrame in one subplot using matplotlib
Asked by Max2603
I have a pandas dataframe like the following:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({'a_wood': np.random.randn(100),
                   'a_grassland': np.random.randn(100),
                   'a_settlement': np.random.randn(100),
                   'b_wood': np.random.randn(100),
                   'b_grassland': np.random.randn(100),
                   'b_settlement': np.random.randn(100)})
and I want to create a histogram of each column, with every column header in its own subplot.
fig, ax = plt.subplots(2, 3, sharex='col', sharey='row')
m = 0
for i in range(2):
    for j in range(3):
        df.hist(column=df.columns[m], bins=12, ax=ax[i, j], figsize=(20, 18))
        m += 1
The previous code works perfectly for that, but now I want to combine each pair of a and b headers (e.g. "a_wood" and "b_wood") into one subplot, so there would be just three histograms. I tried passing two columns with df.columns[[m,m+3]], but this doesn't work. I also have an index column with strings like "day_1", which I want to be on the x-axis. Can someone help me?
Answered by Alex
I don't know if I understood your question correctly, but something like this can combine the plots. You might want to play around a little with the alpha and change the headers.
# NOTE: you might want to specify your bins explicitly, or they won't line up exactly
fig, ax = plt.subplots(1, 3, sharex='col', sharey='row', figsize=(20, 18))
n = 3
for j in range(n):
    df.hist(column=df.columns[j], bins=12, ax=ax[j], alpha=0.5, color='red')
    df.hist(column=df.columns[j+n], bins=12, ax=ax[j], alpha=0.5, color='blue')
    ax[j].set_title(df.columns[j][2:])
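The note about bins in the code above can be made concrete. The following is only a sketch, not part of the original answer: it assumes the same six column names as the question, uses a seeded random generator for the sample data, and computes one set of bin edges per subplot with np.histogram_bin_edges so that both histograms share the same edges.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Sample data with the same column layout as the question (values are random).
rng = np.random.default_rng(0)
names = ["wood", "grassland", "settlement"]
df = pd.DataFrame({f"{p}_{n}": rng.standard_normal(100)
                   for p in "ab" for n in names})

fig, ax = plt.subplots(1, 3, sharey="row", figsize=(12, 4))
for j, name in enumerate(names):
    a, b = df[f"a_{name}"], df[f"b_{name}"]
    # One set of 12 bins spanning both columns, so the bars share edges.
    edges = np.histogram_bin_edges(np.concatenate([a, b]), bins=12)
    ax[j].hist(a, bins=edges, alpha=0.5, color="red", label=f"a_{name}")
    ax[j].hist(b, bins=edges, alpha=0.5, color="blue", label=f"b_{name}")
    ax[j].set_title(name)
    ax[j].legend()
```

Passing the same array of edges to both hist calls is what makes the overlaid bars line up; with bins=12 alone, each column gets its own 12 bins over its own range.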
To plot them both next to each other, try this:
# This example doesn't have the issue with different bin sizes within one subplot
fig, ax = plt.subplots(1, 3, sharex='col', sharey='row', figsize=(20, 18))
n = 3
colors = ['red', 'blue']
axes = ax.flatten()
for i, j in zip(range(n), axes):
    j.hist([df.iloc[:, i], df.iloc[:, i+n]], bins=12, color=colors)
    j.set_title(df.columns[i][2:])
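Both snippets above pair columns by position (column i with column i+n), which relies on the a columns and b columns appearing in the same order. As a possible alternative, and purely as an assumption about how the headers are named, the pairs could be built from the part of each name after the first underscore. The pairs dictionary and the variable names below are hypothetical, not from the original answer.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Sample data with the same column names as the question (values are random).
rng = np.random.default_rng(0)
cols = ["a_wood", "a_grassland", "a_settlement",
        "b_wood", "b_grassland", "b_settlement"]
df = pd.DataFrame({c: rng.standard_normal(100) for c in cols})

# Group column names by the text after the first underscore.
pairs = {}
for col in df.columns:
    prefix, suffix = col.split("_", 1)
    pairs.setdefault(suffix, []).append(col)

fig, axes = plt.subplots(1, len(pairs), sharey="row", figsize=(12, 4))
for axis, (suffix, members) in zip(axes, pairs.items()):
    # A list of datasets gives side-by-side bars on shared bins.
    axis.hist([df[c] for c in members], bins=12, label=members)
    axis.set_title(suffix)
    axis.legend()
```

This keeps working if the column order changes, as long as every header follows the prefix_suffix pattern.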
Answered by Eric
You want something that loops through each column and plots its data in a histogram, right? I can suggest a few modifications that you can reuse in future code. Before giving the code, here are a few useful tips:
- DataFrames have attributes that you can loop over; for instance, the attribute .columns gives you the list of column names.
- When plotting, I noticed that indexing the grid directly by its (row, column) coordinates makes the code hard to adapt, so you need to 'flatten' the grid of axes, hence the use of ax.ravel(). enumerate() is also always useful for looping through an object while getting the ith element and its index at the same time.
- Understanding subplots in Python is tricky at the beginning, so reading other people's code is really helpful. I strongly advise you to look at the plots in the examples for scikit functions (it helped me a lot).
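To illustrate the tips above, here is a small sketch (sample data and variable names are assumptions, not from the original answer) that flattens a 2x3 grid of axes with ravel() and walks over df.columns with enumerate():

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Sample data with the same column names as the question (values are random).
rng = np.random.default_rng(0)
cols = ["a_wood", "a_grassland", "a_settlement",
        "b_wood", "b_grassland", "b_settlement"]
df = pd.DataFrame({c: rng.standard_normal(100) for c in cols})

fig, ax = plt.subplots(2, 3, figsize=(12, 7))
print(ax.shape)    # (2, 3)
flat = ax.ravel()  # same axes objects, now in a 1-D array
print(flat.shape)  # (6,)

# enumerate() yields the position and the column name together,
# so one index addresses both the column and its subplot.
for idx, col in enumerate(df.columns):
    flat[idx].hist(df[col], bins=12)
    flat[idx].set_title(col)
```

With the flat array, the loop no longer needs to know how many rows and columns the grid has, only that it has at least as many axes as there are columns to plot.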
Here is my proposed code:
fig, ax = plt.subplots(1, 3, sharex='col', sharey='row', figsize=(12, 7))
# ravel() flattens the array of axes: a 2x3 grid would become a flat
# array of 6 axes, so a single index can be used as below
ax = ax.ravel()
for idx in range(3):
    # label= gives legend() an entry for each histogram
    ax[idx].hist(df.iloc[:, idx], bins=12, alpha=0.5, label=df.columns[idx])
    ax[idx].hist(df.iloc[:, idx+3], bins=12, alpha=0.5, label=df.columns[idx+3])
    ax[idx].set_title(df.columns[idx] + ' with ' + df.columns[idx+3])
    ax[idx].legend(loc='upper left')
I hope this is helpful. Feel free to ask me questions if you need more details :)
NOTE: I reused Alex's answer to edit my answer. Also check this matplotlib documentation for more details. In this specific case, point 3 is no longer relevant.