Original question: http://stackoverflow.com/questions/41481153/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
How to label bubble chart/scatter plot with column from pandas dataframe?
Asked by Rachel
I am trying to label a scatter/bubble chart that I create with matplotlib using entries from a column in a pandas data frame. I have seen plenty of related examples and questions (see e.g. here and here), so I tried to annotate the plot accordingly. Here is what I do:
import matplotlib.pyplot as plt
import pandas as pd
#example data frame
x = [5, 10, 20, 30, 5, 10, 20, 30, 5, 10, 20, 30]
y = [100, 100, 200, 200, 300, 300, 400, 400, 500, 500, 600, 600]
s = [5, 10, 20, 30, 5, 10, 20, 30, 5, 10, 20, 30]
users =['mark', 'mark', 'mark', 'rachel', 'rachel', 'rachel', 'jeff', 'jeff', 'jeff', 'lauren', 'lauren', 'lauren']
df = pd.DataFrame(dict(x=x, y=y, users=users))
#my attempt to plot things
plt.scatter(x, y, s=s, alpha=0.5)
plt.xlabel('x')  # placeholder axis labels
plt.ylabel('y')
plt.annotate(df.users, xy=(x,y))  # this is the call that fails
plt.show()
I use a pandas data frame and I somehow get a KeyError, so I guess a dict() object is expected? Is there any other way to label the data with entries from a pandas data frame?
Answered by jezrael
You can use DataFrame.plot.scatter and then select the coordinates in a loop with DataFrame.iat:
ax = df.plot.scatter(x='x', y='y', alpha=0.5)
for i, txt in enumerate(df.users):
    ax.annotate(txt, (df.x.iat[i], df.y.iat[i]))
plt.show()
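For reference, here is a self-contained sketch of the same idea, assuming the example data from the question. It pairs each label with its coordinates via zip instead of DataFrame.iat. The helper name label_points is hypothetical and is not part of pandas or matplotlib.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Example data copied from the question.
df = pd.DataFrame({
    "x": [5, 10, 20, 30, 5, 10, 20, 30, 5, 10, 20, 30],
    "y": [100, 100, 200, 200, 300, 300, 400, 400, 500, 500, 600, 600],
    "users": ["mark", "mark", "mark", "rachel", "rachel", "rachel",
              "jeff", "jeff", "jeff", "lauren", "lauren", "lauren"],
})


def label_points(ax, frame, x_col, y_col, label_col):
    """Hypothetical helper: annotate every row of `frame` on `ax`."""
    # ax.annotate places one text at one point, so it is called once per row.
    for x_val, y_val, label in zip(frame[x_col], frame[y_col], frame[label_col]):
        ax.annotate(label, (x_val, y_val))
    return ax


# DataFrame.plot.scatter returns the matplotlib Axes it drew on.
ax = df.plot.scatter(x="x", y="y", alpha=0.5)
label_points(ax, df, "x", "y", "users")
```

Calling plt.show() afterwards displays the labeled figure.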
Answered by Rutger Kassies
jezrael's answer is fine, but I will post this just to show what I meant with df.iterrows in the other thread.
I'm afraid you have to put the scatter (or plot) command in the loop as well if you want to have a dynamic size.
df = pd.DataFrame(dict(x=x, y=y, s=s, users=users))
fig, ax = plt.subplots(facecolor='w')
for key, row in df.iterrows():
    ax.scatter(row['x'], row['y'], s=row['s']*5, alpha=.5)
    ax.annotate(row['users'], xy=(row['x'], row['y']))
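As an alternative sketch, and not what the answer above does: matplotlib's scatter also accepts an array-like for s, so the per-point sizes can be passed in a single call, leaving only the annotations in a loop. The scale factor of 5 is carried over from the answer, and itertuples is swapped in for iterrows.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Example data copied from the question, including the size column `s`.
df = pd.DataFrame({
    "x": [5, 10, 20, 30, 5, 10, 20, 30, 5, 10, 20, 30],
    "y": [100, 100, 200, 200, 300, 300, 400, 400, 500, 500, 600, 600],
    "s": [5, 10, 20, 30, 5, 10, 20, 30, 5, 10, 20, 30],
    "users": ["mark", "mark", "mark", "rachel", "rachel", "rachel",
              "jeff", "jeff", "jeff", "lauren", "lauren", "lauren"],
})

fig, ax = plt.subplots(facecolor="w")

# One scatter call: `s` takes one marker area per point.
points = ax.scatter(df["x"], df["y"], s=df["s"] * 5, alpha=0.5)

# Annotations still need one call per point.
for row in df.itertuples(index=False):
    ax.annotate(row.users, xy=(row.x, row.y))
```

One visible difference: with a single scatter call all points share one color, whereas separate scatter calls in a loop cycle through the default colors.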