Color by Column Values in Matplotlib
Note: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license and attribute it to the original authors on StackOverflow (not to this page).
Original source: http://stackoverflow.com/questions/14885895/
Asked by zach
One of my favorite aspects of using the ggplot2 library in R is the ability to easily specify aesthetics. I can quickly make a scatterplot and apply color associated with a specific column, and I would love to be able to do this with python/pandas/matplotlib. I'm wondering if there are any convenience functions that people use to map colors to values using pandas dataframes and Matplotlib?
##ggplot scatterplot example with R dataframe, `df`, colored by col3
ggplot(data = df, aes(x=col1, y=col2, color=col3)) + geom_point()
##ideal situation with pandas dataframe, 'df', where colors are chosen by col3
df.plot(x=col1,y=col2,color=col3)
EDIT: Thank you for your responses but I want to include a sample dataframe to clarify what I am asking. Two columns contain numerical data and the third is a categorical variable. The script I am thinking of will assign colors based on this value.
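import numpy as np
import pandas as pd
# size=10 draws one value per row; normal(10) would return a single number
df = pd.DataFrame({'Height': np.random.normal(size=10),
                   'Weight': np.random.normal(size=10),
                   'Gender': ["Male","Male","Male","Male","Male",
                              "Female","Female","Female","Female","Female"]})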
Accepted answer by Paul H
Update October 2015
Seaborn handles this use-case splendidly:
import numpy
import pandas
from matplotlib import pyplot
import seaborn
seaborn.set(style='ticks')
numpy.random.seed(0)
N = 37
_genders = ['Female', 'Male', 'Non-binary', 'No Response']
df = pandas.DataFrame({
    'Height (cm)': numpy.random.uniform(low=130, high=200, size=N),
    'Weight (kg)': numpy.random.uniform(low=30, high=100, size=N),
    'Gender': numpy.random.choice(_genders, size=N)
})
fg = seaborn.FacetGrid(data=df, hue='Gender', hue_order=_genders, aspect=1.61)
fg.map(pyplot.scatter, 'Weight (kg)', 'Height (cm)').add_legend()
Which immediately outputs a scatter plot of height against weight, with the points colored by gender and a legend alongside.
Old Answer
In this case, I would use matplotlib directly.
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

def dfScatter(df, xcol='Height', ycol='Weight', catcol='Gender'):
    fig, ax = plt.subplots()
    # Give each category an evenly spaced number in [0, 1] for the colormap
    categories = np.unique(df[catcol])
    colors = np.linspace(0, 1, len(categories))
    colordict = dict(zip(categories, colors))

    # Note: this adds a "Color" column to the dataframe that was passed in
    df["Color"] = df[catcol].apply(lambda x: colordict[x])
    ax.scatter(df[xcol], df[ycol], c=df.Color)
    return fig

if __name__ == '__main__':
    df = pd.DataFrame({'Height': np.random.normal(size=10),
                       'Weight': np.random.normal(size=10),
                       'Gender': ["Male","Male","Unknown","Male","Male",
                                  "Female","Did not respond","Unknown","Female","Female"]})
    fig = dfScatter(df)
    fig.savefig('fig1.png')
And that gives me a scatter plot in which each category gets its own color from the default colormap.
As far as I know, that color column can be any matplotlib compatible color (RGBA tuples, HTML names, hex values, etc).
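As a rough sketch of that idea (not part of the original answer): the category column can be mapped through a plain dict of color specs. The `palette` dict and its color choices below are hypothetical, and every spec is normalized with `matplotlib.colors.to_rgba` so that names, hex strings and tuples can be mixed freely.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from matplotlib.colors import to_rgba

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "Height": rng.normal(size=6),
    "Weight": rng.normal(size=6),
    "Gender": ["Male", "Female", "Unknown", "Male", "Female", "Unknown"],
})

# Hypothetical palette: a named color, a hex value and an RGBA tuple
palette = {
    "Male": "tab:blue",
    "Female": "#ff7f0e",
    "Unknown": (0.5, 0.5, 0.5, 1.0),
}

# Convert every spec to an RGBA tuple, one per row
point_colors = [to_rgba(palette[g]) for g in df["Gender"]]

fig, ax = plt.subplots()
ax.scatter(df["Height"], df["Weight"], c=point_colors)
# fig.savefig("palette_scatter.png")  # or plt.show() in an interactive session
```

A category missing from `palette` raises a `KeyError` here, which is usually what you want while exploring; a default color could be supplied with `palette.get(g, "black")` instead.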
I'm having trouble getting anything but numerical values to work with the colormaps.
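One possible workaround, sketched here as an assumption rather than as the answer's own method, is to turn the categories into integer codes with `pandas.factorize` and hand those codes to the colormap. The legend step relies on `PathCollection.legend_elements`, which needs a reasonably recent matplotlib (3.1 or later, as far as I know).

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "Height": rng.normal(size=10),
    "Weight": rng.normal(size=10),
    "Gender": ["Male", "Male", "Unknown", "Male", "Male",
               "Female", "Did not respond", "Unknown", "Female", "Female"],
})

# factorize returns integer codes plus the unique labels,
# in order of first appearance
codes, labels = pd.factorize(df["Gender"])

fig, ax = plt.subplots()
points = ax.scatter(df["Height"], df["Weight"], c=codes, cmap="viridis")

# One legend handle per distinct code, in ascending code order,
# which lines up with the order of `labels`
handles, _ = points.legend_elements()
ax.legend(handles, list(labels), title="Gender")
```

The colormap only ever sees numbers, so any colormap name accepted by `cmap` should work; the text labels are used for the legend alone.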
Answer by tarotcard
You can use the color parameter of the plot method to define the colors you want for each column. For example:
from pandas import DataFrame
data = DataFrame({'a':range(5),'b':range(1,6),'c':range(2,7)})
colors = ['yellowgreen','cyan','magenta']
data.plot(color=colors)


You can use color names or hex color codes, for example '#000000' for black. You can find all the defined color names in matplotlib's colors.py file. Below is the link to the colors.py file in matplotlib's GitHub repo.
https://github.com/matplotlib/matplotlib/blob/master/lib/matplotlib/colors.py
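If you would rather check a color name from Python than read the source file, something like the following should work. It is a small sketch using helpers from `matplotlib.colors`; the variable names are only for illustration.

```python
from matplotlib import colors as mcolors

# Dictionary of the CSS4/HTML color names known to matplotlib
named = mcolors.CSS4_COLORS
has_yellowgreen = "yellowgreen" in named

# Convert between representations
hex_code = mcolors.to_hex("yellowgreen")   # name -> hex string
rgba = mcolors.to_rgba("#000000")          # hex -> RGBA tuple of floats

# Validate a spec before passing it to a plotting call
valid = mcolors.is_color_like("magenta")
invalid = mcolors.is_color_like("not-a-color")

print(has_yellowgreen, hex_code, rgba, valid, invalid)
```

`is_color_like` is handy for validating a user-supplied list such as `colors` above before it reaches `data.plot`.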
Answer by Anton Protopopov
Actually, you could use ggplot for Python:
from ggplot import *
import numpy as np
import pandas as pd
df = pd.DataFrame({'Height': np.random.randn(10),
                   'Weight': np.random.randn(10),
                   'Gender': ["Male","Male","Male","Male","Male",
                              "Female","Female","Female","Female","Female"]})
ggplot(aes(x='Height', y='Weight', color='Gender'), data=df) + geom_point()
Answer by Egor Ignatenkov
https://seaborn.pydata.org/generated/seaborn.scatterplot.html
import numpy
import pandas
import seaborn as sns
numpy.random.seed(0)
N = 37
_genders = ['Female', 'Male', 'Non-binary', 'No Response']
df = pandas.DataFrame({
    'Height (cm)': numpy.random.uniform(low=130, high=200, size=N),
    'Weight (kg)': numpy.random.uniform(low=30, high=100, size=N),
    'Gender': numpy.random.choice(_genders, size=N)
})
sns.scatterplot(data=df, x='Height (cm)', y='Weight (kg)', hue='Gender')

