pandas: scatter plot by category

Note: this page reproduces a popular Stack Overflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original question: http://stackoverflow.com/questions/31328526/

scatter plot by category in pandas

Tags: python, pandas, matplotlib, scatter-plot

Asked by jason adams

This has been troubling me for the past 30 minutes. I'd like to make a scatter plot colored by category. I took a look at the documentation, but I haven't been able to find the answer there. I looked here, but when I ran that in an IPython Notebook, I didn't get anything.

Here's my data frame:

time    cpu   wait    category 
8       1     0.5     a 
9       2     0.2     a
2       3     0.1     b
10      4     0.7     c
3       5     0.2     c
5       6     0.8     b
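
For anyone who wants to try the answers below, here is a minimal sketch that rebuilds the table above as a DataFrame. The variable name `df` matches what the answers use; the literal values are copied from the table.

```python
import pandas as pd

# Rebuild the sample data from the question as a DataFrame.
df = pd.DataFrame({
    "time": [8, 9, 2, 10, 3, 5],
    "cpu": [1, 2, 3, 4, 5, 6],
    "wait": [0.5, 0.2, 0.1, 0.7, 0.2, 0.8],
    "category": ["a", "a", "b", "c", "c", "b"],
})
print(df)
```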

Ideally, I'd like to have a scatter plot that shows CPU on the x axis, wait on the y axis, and each point on the graph is distinguished by category. So for example, if a=red, b=blue, and c=green then point (1, 0.5) and (2, 0.2) should be red, (3, 0.1) and (6, 0.8) should be blue, etc.

How would I do this with pandas or matplotlib? Whichever does the job.

Accepted answer by Alexander

This is essentially the same answer as @JoeCondron's, but as a two-liner:

cmap = {'a': 'red', 'b': 'blue', 'c': 'yellow'}
df.plot(x='cpu', y='wait', kind='scatter', 
        colors=[cmap.get(c, 'black') for c in df.category])

If no color is mapped for the category, it defaults to black.

EDIT:

The above works for Pandas 0.14.1. For 0.16.2, 'colors' needs to be changed to 'c':

df.plot(x='cpu', y='wait', kind='scatter',
        c=[cmap.get(c, 'black') for c in df.category])
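
For reference, the `c=` variant can be put together as a self-contained script roughly like this. It is only a sketch: it assumes a reasonably recent pandas and matplotlib, rebuilds the question's data inline, and selects the non-interactive `Agg` backend so it also runs without a display. The name `point_colors` is introduced here for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line in a notebook
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "cpu": [1, 2, 3, 4, 5, 6],
    "wait": [0.5, 0.2, 0.1, 0.7, 0.2, 0.8],
    "category": ["a", "a", "b", "c", "c", "b"],
})

cmap = {"a": "red", "b": "blue", "c": "yellow"}
# One color name per row; categories missing from cmap fall back to black.
point_colors = [cmap.get(c, "black") for c in df["category"]]

# df.plot returns the matplotlib Axes the points were drawn on.
ax = df.plot(x="cpu", y="wait", kind="scatter", c=point_colors)
```

The returned `ax` can then be labelled or saved with the usual matplotlib calls.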

Answer by JoeCondron

You could do

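import matplotlib.pyplot as plt

# Map each category to a matplotlib color alias.
color_map = {'a': 'r', 'b': 'b', 'c': 'y'}
ax = plt.subplot()
x, y = df.cpu, df.wait
# One color per row, in the same order as x and y.
colors = df.category.map(color_map)
ax.scatter(x, y, color=colors)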

This will give you red for category a, blue for b, and yellow for c. So you can pass a list of color aliases of the same length as the arrays. You can check out the myriad available colours here: http://matplotlib.org/api/colors_api.html. I don't think the plot method is very useful for scatter plots.
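
One thing to watch with `Series.map` and a dict: a category that has no entry in the dict comes back as NaN rather than a color. A small sketch of one way to supply a fallback follows; the extra category `'d'` is hypothetical and not part of the question's data.

```python
import pandas as pd

color_map = {"a": "r", "b": "b", "c": "y"}

# 'd' is a hypothetical category with no entry in color_map.
category = pd.Series(["a", "b", "c", "d"])

# map() yields NaN for unmapped categories; fillna() substitutes
# 'k', matplotlib's one-letter alias for black.
colors = category.map(color_map).fillna("k")
print(colors.tolist())
```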

Answer by alex314159

I'd create a column with your colors based on category, then do the following, where ax is a matplotlib Axes and df is your dataframe:

ax.scatter(df['cpu'], df['wait'], marker='.', c=df['colors'], s=100)
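
The answer does not show how the `colors` column is built. One possible way, sketched here under the assumption that the red/blue/green mapping from the question is wanted, is to map the category column through a dict. The `Agg` backend line is only there so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "cpu": [1, 2, 3, 4, 5, 6],
    "wait": [0.5, 0.2, 0.1, 0.7, 0.2, 0.8],
    "category": ["a", "a", "b", "c", "c", "b"],
})

# Build the colors column from the category column.
df["colors"] = df["category"].map({"a": "red", "b": "blue", "c": "green"})

fig, ax = plt.subplots()
ax.scatter(df["cpu"], df["wait"], marker=".", c=df["colors"], s=100)
ax.set_xlabel("cpu")
ax.set_ylabel("wait")
```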