Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/42532319/

Time: 2020-09-14 03:06:25  Source: igfitidea

Grouped bar chart from two pandas data frames

python, pandas, dataframe

Asked by Fourier

I have two data frames containing different values but the same structure:

df1 =

         0         1         2         3         4 
D  0.003073  0.014888  0.155815  0.826224       NaN
E  0.000568  0.000435  0.000967  0.002956  0.067249  

df2 =

          0         1         2         3         4
D  0.746689  0.185769  0.060107  0.007435       NaN
E  0.764552  0.000000  0.070288  0.101148  0.053499

I want to plot both data frames in a single grouped bar chart. In addition, each row (index) should be a subplot.

This can be easily achieved for one of them using pandas directly:

df1.T.plot(kind="bar", subplots=True, layout=(2,1), width=0.7, figsize=(10,10), sharey=True)

I tried to join them using

pd.concat([df1, df2], axis=1)

which results in a new dataframe:

         0         1         2         3         4         0         1         2         3         4
D  0.003073  0.014888  0.155815  0.826224       NaN  0.746689  0.185769  0.060107  0.007435       NaN
E  0.000568  0.000435  0.000967  0.002956  0.067249  0.764552  0.000000  0.070288  0.101148  0.053499

However, plotting this data frame with the method above does not group the bars per column; it treats each column separately. In every subplot this produces an x-axis with duplicated ticks in column order, e.g. 0,1,2,3,4,0,1,2,3,4.

Any ideas?

Accepted answer by Moritz

It is not entirely clear how the data is organized. Pandas and seaborn usually expect tidy datasets. Because you transpose the data before plotting, I assume you have two variables (A and B) and four observations (e.g. measurements):

import numpy as np
import pandas as pd
df1 = pd.DataFrame.from_records(np.random.rand(2,4), index = ['A','B'])
df2 = pd.DataFrame.from_records(np.random.rand(2,4), index = ['A','B'])

df1.T

Maybe this is close to what you want:

df4 = pd.concat([df1.T, df2.T], axis=0, ignore_index=False)
df4['col'] = (len(df1.T)*(0,) + len(df2.T)*(1,))
df4.reset_index(inplace=True)
df4
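
The same long-format table can probably be built in one step by letting `pd.concat` tag the rows itself. The following is only a sketch under assumptions: the fixed numbers replace the random data so the result is reproducible, `long_df` is a hypothetical name, and the labels `0` and `1` simply mirror the `col` values used above.

```python
import numpy as np
import pandas as pd

# Deterministic stand-ins for the random frames above (values are made up).
df1 = pd.DataFrame(np.arange(8).reshape(2, 4) / 10.0, index=["A", "B"])
df2 = pd.DataFrame(np.arange(8, 16).reshape(2, 4) / 10.0, index=["A", "B"])

# keys= labels the rows of each frame (0 for df1, 1 for df2);
# names= names the two resulting index levels so that reset_index()
# turns them into the 'col' and 'index' columns.
long_df = (
    pd.concat([df1.T, df2.T], keys=[0, 1], names=["col", "index"])
    .reset_index()
)
print(long_df)
```

Compared with assigning a tuple to `df4['col']`, this avoids counting rows by hand, at the cost of a different column order (`col`, `index`, `A`, `B`).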

Using seaborn's FacetGrid allows for convenient plotting:

# factorplot was renamed to catplot in seaborn 0.9
sns.catplot(x='index', y='A', hue='col', kind='bar', data=df4)
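
If the goal is the layout described in the question (one subplot per row, bars from both frames grouped per column), a pandas-only route may also work. This is a sketch, not the answer's method: the data values are copied from the question, while the labels `"df1"` and `"df2"` and the names `combined` and `grouped` are assumptions introduced for illustration.

```python
import numpy as np
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Values copied from the question.
df1 = pd.DataFrame(
    [[0.003073, 0.014888, 0.155815, 0.826224, np.nan],
     [0.000568, 0.000435, 0.000967, 0.002956, 0.067249]],
    index=["D", "E"],
)
df2 = pd.DataFrame(
    [[0.746689, 0.185769, 0.060107, 0.007435, np.nan],
     [0.764552, 0.000000, 0.070288, 0.101148, 0.053499]],
    index=["D", "E"],
)

# keys= adds an outer column level naming the source frame,
# so the duplicated column labels 0..4 stay distinguishable.
combined = pd.concat([df1, df2], axis=1, keys=["df1", "df2"])

fig, axes = plt.subplots(nrows=len(combined), ncols=1, figsize=(10, 10), sharey=True)
for ax, (label, row) in zip(axes, combined.iterrows()):
    # Each row is indexed by (source, column); unstacking the source level
    # yields one column per frame, which pandas draws as grouped bars.
    grouped = row.unstack(level=0)
    grouped.plot(kind="bar", ax=ax, width=0.7, title=label)
fig.tight_layout()
```

Each subplot then has a single tick per original column (0 to 4) with two bars side by side, instead of the duplicated ticks produced by plotting the plain `pd.concat([df1, df2], axis=1)` result.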
