Grouped bar chart from two pandas data frames

Question

Asked by Fourier

I have two data frames containing different values but the same structure:

df1 =

         0         1         2         3         4 
D  0.003073  0.014888  0.155815  0.826224       NaN
E  0.000568  0.000435  0.000967  0.002956  0.067249

df2 =

          0         1         2         3         4
D  0.746689  0.185769  0.060107  0.007435       NaN
E  0.764552  0.000000  0.070288  0.101148  0.053499

I want to plot both data frames in a single grouped bar chart. In addition, each row (index) should be a subplot.

This can be easily achieved for one of them using pandas directly:

df1.T.plot(kind="bar", subplots=True, layout=(2,1), width=0.7, figsize=(10,10), sharey=True)

I tried to join them using

pd.concat([df1, df2], axis=1)

which results in a new dataframe:

         0         1         2         3         4         0         1         2         3         4
D  0.003073  0.014888  0.155815  0.826224       NaN  0.746689  0.185769  0.060107  0.007435       NaN
E  0.000568  0.000435  0.000967  0.002956  0.067249  0.764552  0.000000  0.070288  0.101148  0.053499

However, plotting this data frame with the above method does not group the bars per column but treats every column separately. In each subplot this results in an x-axis with duplicated ticks in column order, e.g. 0,1,2,3,4,0,1,2,3,4.

Any ideas?

Answer 1

Accepted answer by Moritz

It is not entirely clear how the data is organized. Pandas and seaborn usually expect tidy datasets. Because you transpose the data prior to plotting, I assume you have two variables (A and B) and four observations (e.g. measurements):

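import numpy as np
import pandas as pd

# Two variables (rows A and B) with four random observations each (columns 0..3)
df1 = pd.DataFrame.from_records(np.random.rand(2,4), index = ['A','B'])
df2 = pd.DataFrame.from_records(np.random.rand(2,4), index = ['A','B'])

df1.T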

Maybe this is close to what you want:

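# Stack the transposed frames on top of each other, keeping the observation labels
df4 = pd.concat([df1.T, df2.T], axis=0, ignore_index=False)
# Tag each row with its source frame: 0 for df1, 1 for df2
df4['col'] = (len(df1.T)*(0,) + len(df2.T)*(1,))
# Turn the observation labels into a regular column named 'index'
df4.reset_index(inplace=True)
df4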
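
The same long-format idea can be applied to the two frames from the question. The sketch below is only one possible reshaping, not necessarily what the answer had in mind: the column names `source`, `row`, `column` and `value` are made up for illustration, and the numbers are copied from the question.

```python
import numpy as np
import pandas as pd

# Data copied from the question
df1 = pd.DataFrame(
    [[0.003073, 0.014888, 0.155815, 0.826224, np.nan],
     [0.000568, 0.000435, 0.000967, 0.002956, 0.067249]],
    index=["D", "E"],
)
df2 = pd.DataFrame(
    [[0.746689, 0.185769, 0.060107, 0.007435, np.nan],
     [0.764552, 0.000000, 0.070288, 0.101148, 0.053499]],
    index=["D", "E"],
)

# Label each frame with a key, then melt to one row per
# (source frame, row label, column label) combination.
# The names "source", "row", "column" and "value" are illustrative assumptions.
tidy = (
    pd.concat({"df1": df1, "df2": df2})
    .rename_axis(["source", "row"])
    .reset_index()
    .melt(id_vars=["source", "row"], var_name="column", value_name="value")
)
print(tidy.head())
```

In this shape, `column` would go on the x-axis, `source` would be the hue, and `row` would select the subplot.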

Using seaborn's facet grid allows for convenient plotting:

# factorplot was renamed to catplot in seaborn 0.9 (requires: import seaborn as sns)
sns.catplot(x='index', y='A', hue='col', kind='bar', data=df4)
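
If seaborn is not available, the layout described in the question (one subplot per row, with the bars of both frames grouped per column) can probably also be reached with pandas and matplotlib alone. The following is a minimal sketch under that assumption, using the data from the question; the legend labels `df1` and `df2` are chosen here for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Data copied from the question
df1 = pd.DataFrame(
    [[0.003073, 0.014888, 0.155815, 0.826224, np.nan],
     [0.000568, 0.000435, 0.000967, 0.002956, 0.067249]],
    index=["D", "E"],
)
df2 = pd.DataFrame(
    [[0.746689, 0.185769, 0.060107, 0.007435, np.nan],
     [0.764552, 0.000000, 0.070288, 0.101148, 0.053499]],
    index=["D", "E"],
)

# One subplot per row label; squeeze=False keeps `axes` two-dimensional
# even if the frames had only a single row.
fig, axes = plt.subplots(
    nrows=len(df1.index), ncols=1, figsize=(10, 10), sharey=True, squeeze=False
)
for ax, row in zip(axes[:, 0], df1.index):
    # Two columns (one per source frame), indexed by the original column labels,
    # so pandas draws the two bars for each label side by side.
    pair = pd.DataFrame({"df1": df1.loc[row], "df2": df2.loc[row]})
    pair.plot(kind="bar", ax=ax, width=0.7)
    ax.set_title(row)
fig.tight_layout()
```

Each subplot then has a single set of x ticks with two bars per tick, instead of the duplicated tick sequence produced by the column-wise `pd.concat`.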
