Note: this page reproduces a popular StackOverflow question under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/32239093/
python pandas histogram plot including NaN values
Asked by Hari
I want to draw a histogram of some data. I cannot attach a sample histogram because I do not have enough reputation, so I hope my description of the problem is clear. I am using python pandas, and it appears that pandas treats any NaN value as 0. Is there a way to include the count of NaN values in the histogram? That is, the x-axis should show NaN as a value as well. Thank you very much.
Answered by Monique Hendriks
I was looking for the same thing. I ended up with the following solution:
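import matplotlib.pyplot as plt

# `data` is assumed to be a numeric pandas Series that contains NaN values
figure = plt.figure(figsize=(6, 9), dpi=100)
graph = figure.add_subplot(111)
# value_counts() drops NaN by default, so this counts only the observed values
freq = data.value_counts()
graph.bar(freq.index, freq.values)  # gives the graph without NaN
# count the missing values separately; looking up NaN in freq would fail because NaN was dropped
n_missing = data.isna().sum()
graph.bar([0], [n_missing])  # gives a bar for the number of missing values at x=0
plt.show()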
This gave me a histogram with a column at 0 showing the number of missing values in the data.
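A bar at x=0 can collide with real data if 0 is itself an observed value. The following is a minimal sketch of one way around that, assuming integer-like numeric data: it places the NaN bar one unit to the left of the smallest observed value and labels that tick "NaN". The sample values and the names `nan_pos` and `n_missing` are illustrative and not part of the original answer.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display; drop for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical sample data (not from the original question)
data = pd.Series([1.0, 2.0, 2.0, 3.0, np.nan, 4.0, np.nan, 2.0])

# Count each observed value; value_counts() drops NaN by default
freq = data.value_counts().sort_index()
# Count the missing values separately
n_missing = int(data.isna().sum())

# Put the NaN bar one unit left of the smallest observed value
nan_pos = freq.index.min() - 1

fig, ax = plt.subplots()
ax.bar(freq.index, freq.values, label="observed")
ax.bar([nan_pos], [n_missing], color="gray", label="NaN")

# Label the extra tick explicitly so the x-axis shows "NaN"
ax.set_xticks([nan_pos] + list(freq.index))
ax.set_xticklabels(["NaN"] + [f"{v:g}" for v in freq.index])
ax.legend()
```

Call `plt.show()` (without the `Agg` line) to display the figure interactively.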
Answered by maleckicoa
Did you try replacing NaN with some other unique value and then plotting the histogram?
import numpy as np
import matplotlib.pyplot as plt
x = -1  # placeholder: use any value that does not occur in the data
plt.hist(df.replace(np.nan, x))
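The replacement idea can be sketched end to end as follows. This assumes integer-valued data, so that unit-width bins centered on each integer are meaningful; the sample values, the `sentinel` choice (one below the minimum), and the bin layout are assumptions made for illustration, not part of the original answer.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display; drop for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical sample data (not from the original question)
values = pd.Series([1.0, 2.0, 2.0, 3.0, np.nan, 4.0, np.nan, 2.0])

# Pick a sentinel below every observed value so it cannot clash with real data
sentinel = values.min() - 1
filled = values.fillna(sentinel)

# Unit-width bins centered on each integer from the sentinel up to the maximum
bins = np.arange(sentinel - 0.5, values.max() + 1.5)

fig, ax = plt.subplots()
counts, edges, _ = ax.hist(filled, bins=bins)

# Relabel the sentinel's tick so the axis reads "NaN" instead of the sentinel value
centers = np.arange(sentinel, values.max() + 1)
ax.set_xticks(centers)
ax.set_xticklabels(["NaN"] + [f"{c:g}" for c in centers[1:]])
```

Here the first bin holds the count of missing values, and the remaining bins hold the counts of 1 through 4.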
Answered by Mark C. F. Sousa
As pointed out by Sreeram TP, it is possible to use the argument dropna=False in the function value_counts to include the counts of NaNs.
import numpy as np
import pandas as pd

df = pd.DataFrame({'feature1': [1, 2, 2, 4, 3, 2, 3, 4, np.nan],
                   'feature2': [4, 4, 3, 4, 1, 4, 3, np.nan, np.nan]})
# Count each value of feature1, keeping NaN as its own category
counts = df['feature1'].value_counts(dropna=False)
# Draw one bar per distinct value, including NaN
counts.plot.bar(title='feat1', grid=True)
I cannot insert images, so here is the result: [image: bar plot of the feature1 counts]
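By default `value_counts` orders bars by frequency, which can make the x-axis hard to read as a histogram. As a hedged extension of the same idea, the sketch below sorts the counts by value (NaN sorts last), relabels the missing-value entry as "NaN", and repeats this for both columns. The dictionary name `all_counts` and the subplot layout are illustrative choices, not part of the original answer.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display; drop for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

df = pd.DataFrame({'feature1': [1, 2, 2, 4, 3, 2, 3, 4, np.nan],
                   'feature2': [4, 4, 3, 4, 1, 4, 3, np.nan, np.nan]})

fig, axes = plt.subplots(1, 2, figsize=(8, 4))
all_counts = {}  # illustrative container for the per-column counts

for ax, col in zip(axes, df.columns):
    # Keep NaN as its own category, then order by value; NaN is placed last
    counts = df[col].value_counts(dropna=False).sort_index()
    # Use readable string labels, with "NaN" for the missing-value entry
    counts.index = ["NaN" if pd.isna(v) else f"{v:g}" for v in counts.index]
    counts.plot.bar(ax=ax, title=col, grid=True)
    all_counts[col] = counts

fig.tight_layout()
```

Each panel then shows one bar per distinct value in ascending order, followed by a final bar for the missing values.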

