Grouped bar chart from two pandas data frames
Original question: http://stackoverflow.com/questions/42532319/
Notice: this page reproduces a popular Stack Overflow question under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author, and attribute it to the original authors (not me): Stack Overflow
Asked by Fourier
I have two data frames containing different values but the same structure:
df1 =
          0         1         2         3         4
D  0.003073  0.014888  0.155815  0.826224       NaN
E  0.000568  0.000435  0.000967  0.002956  0.067249
df2 =
          0         1         2         3         4
D  0.746689  0.185769  0.060107  0.007435       NaN
E  0.764552  0.000000  0.070288  0.101148  0.053499
I want to plot both data frames in a single grouped bar chart. In addition, each row (index) should be a subplot.
This can be easily achieved for one of them using pandas directly:
df1.T.plot(kind="bar", subplots=True, layout=(2,1), width=0.7, figsize=(10,10), sharey=True)
I tried to join them using
pd.concat([df1, df2], axis=1)
which results in a new dataframe:
          0         1         2         3         4         0         1         2         3         4
D  0.003073  0.014888  0.155815  0.826224       NaN  0.746689  0.185769  0.060107  0.007435       NaN
E  0.000568  0.000435  0.000967  0.002956  0.067249  0.764552  0.000000  0.070288  0.101148  0.053499
However, plotting the concatenated data frame with the above method does not group the bars by column; it treats every column separately. Each subplot ends up with an x-axis whose ticks are duplicated in column order, e.g. 0,1,2,3,4,0,1,2,3,4.
Any ideas?
Accepted answer by Moritz
It is not entirely clear how the data is organized. Pandas and seaborn usually expect tidy datasets. Because you transpose the data before plotting, I assume you have two variables (A and B) and four observations (e.g. measurements):
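import numpy as np
import pandas as pd
# Two variables (rows A and B) with four random observations each
df1 = pd.DataFrame.from_records(np.random.rand(2,4), index = ['A','B'])
df2 = pd.DataFrame.from_records(np.random.rand(2,4), index = ['A','B'])
df1.T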
Maybe this is close to what you want:
df4 = pd.concat([df1.T, df2.T], axis=0, ignore_index=False)
df4['col'] = (len(df1.T)*(0,) + len(df2.T)*(1,))
df4.reset_index(inplace=True)
df4
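As an aside, the same stacking idea can be sketched with the actual values from the question. This is only a minimal sketch: the level names `source` and `position` and the column names `row` and `value` are labels made up for illustration, and passing `keys` to `pd.concat` stands in for the manually built `col` marker above.

```python
import numpy as np
import pandas as pd

# Values copied from the question's df1 and df2.
df1 = pd.DataFrame(
    [[0.003073, 0.014888, 0.155815, 0.826224, np.nan],
     [0.000568, 0.000435, 0.000967, 0.002956, 0.067249]],
    index=["D", "E"],
)
df2 = pd.DataFrame(
    [[0.746689, 0.185769, 0.060107, 0.007435, np.nan],
     [0.764552, 0.000000, 0.070288, 0.101148, 0.053499]],
    index=["D", "E"],
)

# Stack the transposed frames; `keys` records which frame each row came from.
# "source" and "position" are hypothetical level names chosen for this sketch.
stacked = pd.concat(
    [df1.T, df2.T], keys=["df1", "df2"], names=["source", "position"]
)

# Melt to long (tidy) form: one line per (source, position, row label) value.
tidy = stacked.reset_index().melt(
    id_vars=["source", "position"], var_name="row", value_name="value"
)
print(tidy.head())
```

In this long form the originating frame, the x position and the row label (D or E) are ordinary columns, so each can be mapped to a hue, an x-axis or a subplot.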
Seaborn's facet grid then allows for convenient plotting:
import seaborn as sns
sns.catplot(x='index', y='A', hue='col', kind='bar', data=df4)  # factorplot was renamed to catplot in seaborn 0.9
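For the layout the question asks for (one subplot per row, with the two frames grouped side by side), a pandas and matplotlib only variant might look like the sketch below. It is not part of the original answer. It assumes both frames share the same index and columns, and the legend labels `df1` and `df2` are chosen here for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Values copied from the question's df1 and df2.
df1 = pd.DataFrame(
    [[0.003073, 0.014888, 0.155815, 0.826224, np.nan],
     [0.000568, 0.000435, 0.000967, 0.002956, 0.067249]],
    index=["D", "E"],
)
df2 = pd.DataFrame(
    [[0.746689, 0.185769, 0.060107, 0.007435, np.nan],
     [0.764552, 0.000000, 0.070288, 0.101148, 0.053499]],
    index=["D", "E"],
)

# One subplot per row label; squeeze=False keeps `axes` two-dimensional
# even if the frames had only a single row.
fig, axes = plt.subplots(
    nrows=len(df1.index), ncols=1, figsize=(10, 10), sharey=True, squeeze=False
)
for ax, row in zip(axes[:, 0], df1.index):
    # Each column of `pair` is one bar series, so pandas draws the bars
    # from df1 and df2 next to each other at every x position.
    pair = pd.DataFrame({"df1": df1.loc[row], "df2": df2.loc[row]})
    pair.plot(kind="bar", ax=ax, width=0.7, title=str(row))
fig.tight_layout()
```

Because each subplot is drawn from a small two-column frame, the x-axis ticks stay 0 to 4 and are not duplicated as they are with the column-wise `pd.concat` from the question.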