Column histograms in Pandas

Question

Asked by Amelio Vazquez-Reina

Say I have a dataframe like the following:

     A B C D
s1   1 2 4 2
s2   2 1 4 3
s3   1 4 1 3

I would like to get a bar plot that shows the histogram of values per column. That is, a bar plot that shows the per-column histograms next to each other along the x axis, with spacing between the histograms (columns). In other words, it would be a two-level bar chart where, for each column in the dataframe, a group of bars represents the histogram of that column.

In case it matters, we can assume that the set of possible values is known and is the same for every column (e.g. the range [0, 5]).

When I try doing:

df.plot(kind='bar')

I get something completely different from what I want: the x ticks correspond to the rows, instead of [columns: [value0, value1, ..., valueN]]. The closest "in spirit" to what I want is:

df.plot(kind='density')

But I am looking for a histogram-like description per column, more than an overlay of PDFs.

Update

Hopefully an example helps. I am looking for something like the plot below (code here), but instead of showing two scores per group, it would show a histogram of the values in each column of my dataframe:

[Image: grouped bar chart showing two scores per group]

Answer 1

Answered by BKay

This presentation doesn't rescale the data. It translates the individual histograms horizontally so that they don't overlap, and then labels the x axis with the column names (placed at each shifted column's median) rather than with a numeric scale.

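import matplotlib.pyplot as plt
from numpy.random import randn
from pandas import DataFrame

sample = 1000
df = DataFrame(randn(sample, 8))

fig, ax = plt.subplots()
accum_min = 0  # running sum of the minima of the columns plotted so far
accum_max = 0  # running sum of their maxima, plus one spacer per column
spacer = 1
tick_positions = []
for colname in df.columns:
    # Shift this column right by the combined range (plus spacers)
    # of all previously plotted columns.
    shifted = df[colname] - accum_min + accum_max
    tick_positions.append(shifted.median())
    shifted.hist(ax=ax)  # draw every histogram on the same axes
    accum_min += df[colname].min()
    accum_max += df[colname].max() + spacer
# Label each histogram with its column name instead of a numeric scale.
ax.set_xticks(tick_positions)
ax.set_xticklabels(df.columns)
plt.show()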

Resulting multi-histogram picture

Answer 2

Answered by weemattisnot

There is numpy's histogram function, and matplotlib's histogram plotting function 'hist'.
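As a rough sketch of how numpy's histogram function could be applied to the dataframe in the question: the code below counts the values in each column with numpy.histogram and arranges the counts so that a grouped bar chart has one group per column. The unit-width bin edges assume integer values in the range [0, 5] mentioned in the question, and the helper name plot_column_histograms is made up for this illustration.

```python
import numpy as np
import pandas as pd

# The example dataframe from the question.
df = pd.DataFrame(
    {"A": [1, 2, 1], "B": [2, 1, 4], "C": [4, 4, 1], "D": [2, 3, 3]},
    index=["s1", "s2", "s3"],
)

# Assumption: values are integers in the known range [0, 5], so use one
# unit-width bin per value, with edges at -0.5, 0.5, ..., 5.5.
values = np.arange(0, 6)
bin_edges = np.arange(0, 7) - 0.5

# Histogram each column with the same bins, then transpose so that each
# row is a dataframe column and each column is a possible value.
counts = pd.DataFrame(
    {col: np.histogram(df[col], bins=bin_edges)[0] for col in df.columns},
    index=values,
).T


def plot_column_histograms(counts):
    """Hypothetical helper: grouped bars, one group per dataframe column."""
    ax = counts.plot(kind="bar", rot=0)  # x ticks are the row labels A..D
    ax.set_xlabel("column")
    ax.set_ylabel("count")
    ax.legend(title="value")
    return ax
```

Calling plot_column_histograms(counts) followed by matplotlib.pyplot.show() should draw the groups A, B, C and D side by side along the x axis, with one bar per possible value in each group. Because the rows of counts become the x ticks, the transpose is what switches the grouping from values to columns.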
