Scatter plot by category in pandas
Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow
Original URL: http://stackoverflow.com/questions/31328526/
Asked by jason adams
This has been troubling me for the past 30 minutes. What I'd like to do is draw a scatter plot by category. I took a look at the documentation, but I haven't been able to find the answer there. I looked here, but when I ran that in an IPython Notebook, I didn't get anything.
Here's my data frame:
time  cpu  wait  category
8     1    0.5   a
9     2    0.2   a
2     3    0.1   b
10    4    0.7   c
3     5    0.2   c
5     6    0.8   b
Ideally, I'd like a scatter plot that shows cpu on the x axis and wait on the y axis, with each point on the graph distinguished by category. For example, if a=red, b=blue, and c=green, then points (1, 0.5) and (2, 0.2) should be red, (3, 0.1) and (6, 0.8) should be blue, and so on.
How would I do this with pandas or matplotlib? Whichever does the job.
Accepted answer by Alexander
This is essentially the same answer as @JoeCondron's, but as a two-liner:
cmap = {'a': 'red', 'b': 'blue', 'c': 'yellow'}
df.plot(x='cpu', y='wait', kind='scatter',
        colors=[cmap.get(c, 'black') for c in df.category])
If no color is mapped for the category, it defaults to black.
EDIT:
The above works for Pandas 0.14.1. For 0.16.2, 'colors' needs to be changed to 'c':
df.plot(x='cpu', y='wait', kind='scatter',
        c=[cmap.get(c, 'black') for c in df.category])
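For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the data frame from the question and uses the 'c' form of the call. The Agg backend, the variable name point_colors, and the output filename are assumptions added for illustration so the script runs without a display; they are not part of the original answer.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no display is needed
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "time": [8, 9, 2, 10, 3, 5],
    "cpu": [1, 2, 3, 4, 5, 6],
    "wait": [0.5, 0.2, 0.1, 0.7, 0.2, 0.8],
    "category": ["a", "a", "b", "c", "c", "b"],
})

cmap = {"a": "red", "b": "blue", "c": "yellow"}

# One color per row; categories missing from cmap fall back to black.
point_colors = [cmap.get(c, "black") for c in df["category"]]

# DataFrame.plot returns the matplotlib Axes it drew on.
ax = df.plot(x="cpu", y="wait", kind="scatter", c=point_colors)
ax.figure.savefig("scatter_by_category.png")  # hypothetical output filename
```

In a notebook you would normally drop the backend line and the savefig call and let the figure render inline.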
Answer by JoeCondron
You could do:
import matplotlib.pyplot as plt

color_map = {'a': 'r', 'b': 'b', 'c': 'y'}
ax = plt.subplot()
x, y = df.cpu, df.wait
# Map each row's category to its color alias
colors = df.category.map(color_map)
ax.scatter(x, y, color=colors)
This will give you red for category a, blue for b, and yellow for c. So you can pass a list of color aliases of the same length as the arrays. You can check out the myriad available colors here: http://matplotlib.org/api/colors_api.html. I don't think the plot method is very useful for scatter plots.
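One thing to watch with the map approach: unlike cmap.get(c, 'black') in the accepted answer, Series.map leaves NaN for any category that is missing from the dict. A small sketch of adding a fallback, where the category "d" and the default "k" (matplotlib's alias for black) are made up for illustration:

```python
import pandas as pd

color_map = {"a": "r", "b": "b", "c": "y"}

# Hypothetical data: "d" is deliberately missing from color_map.
categories = pd.Series(["a", "b", "c", "d"])

# Series.map yields NaN for keys that are not in the dict,
# so fill those with a default color before plotting.
colors = categories.map(color_map).fillna("k")

print(colors.tolist())  # ['r', 'b', 'y', 'k']
```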
Answer by alex314159
I'd create a column holding a color for each row based on its category, then do the following, where ax is a matplotlib Axes and df is your data frame:
ax.scatter(df['cpu'], df['wait'], marker='.', c=df['colors'], s=100)
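The answer does not show how the colors column is built. One possible way, sketched here under assumptions: the mapping name category_colors, the Agg backend, the axis labels, and the output filename are all illustrative, and the colors follow the a=red, b=blue, c=green example from the question.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "cpu": [1, 2, 3, 4, 5, 6],
    "wait": [0.5, 0.2, 0.1, 0.7, 0.2, 0.8],
    "category": ["a", "a", "b", "c", "c", "b"],
})

# Assumed mapping, matching the colors requested in the question.
category_colors = {"a": "red", "b": "blue", "c": "green"}
df["colors"] = df["category"].map(category_colors)

fig, ax = plt.subplots()
ax.scatter(df["cpu"], df["wait"], marker=".", c=df["colors"], s=100)
ax.set_xlabel("cpu")
ax.set_ylabel("wait")
fig.savefig("scatter_colors_column.png")  # hypothetical output filename
```

Keeping the color in its own column makes it easy to inspect which color each row received before plotting.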

