Notice: this page reproduces a popular StackOverflow question. It is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original source: http://stackoverflow.com/questions/15943945/
Annotate scatterplot from a pandas dataframe
Asked by SilviaDomingo
I am using pandas and matplotlib to visualize this dataframe
       HDD  EnergyPerSquareMeter  Year
0   3333.6                 23.11  1997
1   3349.6                 24.30  1998
2   3319.5                 24.78  1999
3   3059.1                 22.01  2000
4   3287.5                 24.17  2001
5   3054.9                 20.01  2002
6   3330.0                 21.25  2003
7   3307.3                 19.22  2004
8   3401.4                 18.31  2005
9   3261.6                 20.40  2006
10  3212.8                 15.34  2008
11  3231.2                 15.95  2009
12  3570.1                 15.79  2010
13  2995.3                 13.88  2011
And I would like to plot EnergyPerSquareMeter as scatterplot (with the x-axis=HDD) and annotate the points with the year.
I did this:
ax = EnergyvsHDD.plot(x='HDD', y='EnergyPerSquareMeter', marker='o', linestyle='None', figsize=(12, 8))
for i, txt in enumerate(EnergyvsHDD['Year']):
    ax.annotate(txt, (x[i], y[i]), size=10, xytext=(0, 0), ha='right', textcoords='offset points')
The outcome is:
The annotated text of the years doesn't appear near the points. What am I doing wrong?
UPDATED
Using this code:
def label_point_orig(x, y, val, ax):
    a = pd.concat({'x': x, 'y': y, 'val': val}, axis=1)
    print(a)
    for i, point in a.iterrows():
        ax.text(point['x'], point['y'], str(point['val']))
And then:
ax = EnergyvsHDD.set_index('HDD')['EnergyPerSquareMeter'].plot(style='o')
label_point_orig(EnergyvsHDD.HDD, EnergyvsHDD.EnergyPerSquareMeter, EnergyvsHDD.Year, ax)
draw()
The points don't appear in the proper place:
However, it does work with this code:
plt.scatter(list(EnergyvsHDD.HDD), list(EnergyvsHDD.EnergyPerSquareMeter))
label_point_orig(EnergyvsHDD.HDD, EnergyvsHDD.EnergyPerSquareMeter, EnergyvsHDD.Year, plt)
draw()
Does anybody know why?
Answered by Dan Allan
This answer of mine, "Annotate data points while plotting from Pandas DataFrame", gives a working example, and it does work on your dataset.
The code you've shown is not self-contained. What are x and y? Hopefully, they are Series corresponding to the correct columns of your DataFrame. My best guess is that they are not what you think they are. It would be safer to use the columns from your EnergyvsHDD DataFrame directly. (See my linked answer.)
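As a rough sketch of what "use the columns directly" could look like (this is not the code from the linked answer): the helper name annotate_years is hypothetical, the data is only the first four rows of the question's table, and the Agg backend is an assumption made so the snippet runs without a display. Each label reads its coordinates from the same row as its year, so no separate x and y variables are involved.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# First four rows of the question's data, retyped for a self-contained example
EnergyvsHDD = pd.DataFrame({
    "HDD": [3333.6, 3349.6, 3319.5, 3059.1],
    "EnergyPerSquareMeter": [23.11, 24.30, 24.78, 22.01],
    "Year": [1997, 1998, 1999, 2000],
})


def annotate_years(df, ax):
    """Hypothetical helper: label every point with its Year.

    x, y and the label all come from the same DataFrame row,
    so they cannot drift out of sync with the plotted data.
    """
    labels = []
    for row in df.itertuples(index=False):
        labels.append(ax.annotate(
            str(row.Year),
            xy=(row.HDD, row.EnergyPerSquareMeter),  # data coordinates of the point
            xytext=(5, 5),                           # small offset so text clears the marker
            textcoords="offset points",
            size=10,
        ))
    return labels


fig, ax = plt.subplots(figsize=(12, 8))
ax.scatter(EnergyvsHDD["HDD"], EnergyvsHDD["EnergyPerSquareMeter"])
ax.set_xlabel("HDD")
ax.set_ylabel("EnergyPerSquareMeter")
labels = annotate_years(EnergyvsHDD, ax)
```

Calling fig.savefig(...) or plt.show() afterwards would render the figure; the offset of (5, 5) points is an arbitrary choice and can be tuned.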