Python pandas - histogram from two columns?
Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me): StackOverflow
Original source: http://stackoverflow.com/questions/31571830/
pandas - histogram from two columns?
Asked by mnowotka
I have this data:
data = pd.DataFrame().from_dict([r for r in response])
print(data)
_id total
0 213 1
1 194 3
2 205 156
...
Now, if I call:
data.hist()
I will get two separate histograms, one for each column. This is not what I want. What I want is a single histogram made using those two columns, where one column is interpreted as a value and another one as a number of occurrences of this value. What should I do to generate such a histogram?
I tried:
data.hist(column="_id", by="total")
But this generates even more (empty) histograms, along with an error message.
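One likely reason for the extra plots: the `by` argument of `DataFrame.hist` splits the rows into groups and draws a separate histogram for each group, so passing `by="total"` asks for one panel per distinct count. A minimal sketch of that grouping, assuming the three sample rows shown above (the literal DataFrame below is an illustration, not the asker's full data):

```python
import pandas as pd

# Sample rows copied from the question's printed output (assumed representative)
data = pd.DataFrame({"_id": [213, 194, 205], "total": [1, 3, 156]})

# `by` groups rows by the distinct values of the given column;
# each group would get its own subplot.
n_panels = data.groupby("total").ngroups

print(n_panels)
```

With mostly distinct counts, nearly every row ends up in its own panel, which is not the single combined histogram the question asks for.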
Accepted answer by Ami Tavory
You can always drop to the lower-level matplotlib.pyplot.hist:
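import numpy as np
import pandas as pd
from matplotlib.pyplot import hist
# Sample data: '_id' holds the values, 'total' holds the weight of each value
df = pd.DataFrame({
    '_id': np.random.randn(100),
    'total': 100 * np.random.rand(100)
})
# Each value contributes its 'total' to the bin it falls into
hist(df._id, weights=df.total)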
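To see what the `weights` argument does, here is a small sketch using `numpy.histogram`, which accepts the same kind of `weights` argument. The values and counts are the three sample rows from the question; the bin edges are an arbitrary choice made for illustration.

```python
import numpy as np

# Values and their occurrence counts, taken from the question's sample rows
values = np.array([213, 194, 205])
counts = np.array([1, 3, 156])

# Two equal-width bins over [190, 220]; these edges are an arbitrary choice
edges = [190, 205, 220]

# Weighted histogram: each value adds its count to the bin it falls into
weighted, _ = np.histogram(values, bins=edges, weights=counts)

# Same result from expanding every value into `count` copies of itself
expanded, _ = np.histogram(np.repeat(values, counts), bins=edges)

print(weighted)
print(expanded)
```

Both arrays should match, which suggests that weighting by the count column is equivalent to histogramming the raw, repeated observations without having to materialize them.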
Answer by dermen
Since you already have the bin frequencies computed (the total column), just use pandas.DataFrame.plot:
data.plot(x='_id', y='total', kind='hist')
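A point worth checking when trying this: `kind='hist'` bins the values of the plotted column, whereas `kind='bar'` draws one bar per row at the height already stored in that column. If `total` really holds one precomputed count per `_id`, a bar chart may be closer to what is wanted. A hedged sketch, assuming the question's three sample rows and a non-interactive matplotlib backend:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: no display available, render off-screen

import pandas as pd

# Sample rows copied from the question's printed output
data = pd.DataFrame({"_id": [213, 194, 205], "total": [1, 3, 156]})

# One bar per row: x tick labels come from '_id', bar heights from 'total'
ax = data.plot(x="_id", y="total", kind="bar", legend=False)

# Read the drawn bar heights back from the Axes
heights = [patch.get_height() for patch in ax.patches]
print(heights)
```

Note that a bar chart treats `_id` as categorical labels in row order; sort the frame by `_id` first if the bars should follow numeric order.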