Note: this page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/32525718/

Date: 2020-09-13 23:52:28  Source: igfitidea

Assign line colors in pandas

Tags: python, pandas

Asked by GebitsGerbils

I am trying to plot some data in pandas and the inbuilt plot function conveniently plots one line per column. What I want to do is to manually assign each line a color based on a classification I make.

The following works:

import pandas as pd
df = pd.DataFrame({'1': [1, 2, 3, 4], '2': [1, 2, 1, 2]})
s = pd.Series(['c','y'], index=['1','2'])
df.plot(color = s)

But when my indices are integers it no longer works and throws a KeyError:

df = pd.DataFrame({1: [1, 2, 3, 4], 2: [1, 2, 1, 2]})
s = pd.Series(['c','y'], index=[1,2])
df.plot(color = s)

The way I understand it is that when an integer index is used it somehow has to start from 0. That is my guess since the following works as well:

df = pd.DataFrame({0: [1, 2, 3, 4], 1: [1, 2, 1, 2]})
s = pd.Series(['c','y'], index=[1,0])
df.plot(color = s)

My question is:

  • What is happening here?
  • Assuming I have an integer index that does not start from 0 or is not formed of successive numbers, how can I make this work without having to convert the index to string or reindex starting from 0?

EDIT:

I realised that even in the first case, the code doesn't do what I expected it to do. It seems that pandas matches the indices of the DataFrame and the Series only if both are integer indices starting from 0. Otherwise, either a KeyError is thrown or, if the index is a str, the order of the elements is used.

Is this correct? And is there a way to match the Series and DataFrame indices? Or do I have to make sure I pass a list of colours in the right order?

Answered by thecircus

What is happening here?

The keyword argument color is inherited from matplotlib.pyplot.plot(). The details in the documentation don't make it clear that you can put in a list of colors when plotting. Given that color is a keyword argument from matplotlib, I'd recommend not using a Pandas Series to hold the color values.
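
One possible reading of the KeyError in the question, offered as a sketch rather than a statement about pandas internals: if the plotting code picks the i-th color with something like `colors[i]` (an assumption here), then on a Series with an integer index that lookup goes by label, not by position. The snippet below only demonstrates the Series behaviour, which can be checked directly; the variable names are illustrative.

```python
import pandas as pd

# Color Series from the question, indexed by the integer column labels 1 and 2.
s = pd.Series(['c', 'y'], index=[1, 2])

# With an integer index, s[key] is a label lookup: label 1 exists ...
first_by_label = s[1]

# ... but label 0 does not, so asking for "item 0" raises KeyError.
try:
    s[0]
    raised = False
except KeyError:
    raised = True

# Positional access ignores the labels entirely.
first_by_position = s.iloc[0]

# A plain list carries no labels at all, only an order.
as_plain_list = s.tolist()
```

Passing `s.tolist()` or `s.values` hands over a plain sequence, so no label lookup can take place and the colors are used in order.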

How can I make this work?

Use a list instead of a Series. If you were using a Series with an index meant to match the columns of your DataFrame to specific colors, you will need to sort the Series first. If the columns are not in order, you will need to sort the columns as well.

# Option 1
s = s.sort_index()
df.plot(color = s.values) # as per Fiabetto's answer

# Option 2
df.plot(color = ['c', 'y']) # other method
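
As an alternative to sorting both sides, the color Series can be reordered to follow the DataFrame's columns before it is turned into a list. This is a minimal sketch, assuming the Series index holds exactly the same labels as `df.columns`; the column labels 7 and 3 and the name `colors` are made up for illustration, and the `Agg` backend line is only there so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove this line to show the figure
import pandas as pd

# Integer column labels that neither start at 0 nor are consecutive.
df = pd.DataFrame({7: [1, 2, 3, 4], 3: [1, 2, 1, 2]})

# Colors keyed by column label, deliberately in a different order than the columns.
s = pd.Series(['y', 'c'], index=[3, 7])

# Reorder the colors to follow df.columns, then drop the index by making a list.
colors = s.reindex(df.columns).tolist()

# Column 7 is drawn with 'c' and column 3 with 'y'.
ax = df.plot(color=colors)
```

A column with no entry in `s` would come out of `reindex` as NaN, so this approach assumes the Series covers every column.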

Answered by Fabio Lamanna

Try:

df.plot(color = s.values)

This will assign the colors by position, regardless of the values in the index.

EDIT:

I tried with three columns:

df = pd.DataFrame({'1': [1, 2, 3, 4], '2': [1, 2, 1, 2], '3': [4, 3, 2, 1]})
s = pd.Series(['c','y','r'], index=[1,3,2])
df.plot(color = s.sort_index().values)

and after sorting the Series it works.
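
To make the effect of the sort visible, the three-column example can be inspected without plotting. Note that the column labels there are strings while the Series index holds integers, so the two only line up by position once the Series is sorted. The sketch below also shows a label-based variant that converts the column names to integers first; `sorted_colors` and `aligned_colors` are illustrative names, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({'1': [1, 2, 3, 4], '2': [1, 2, 1, 2], '3': [4, 3, 2, 1]})
s = pd.Series(['c', 'y', 'r'], index=[1, 3, 2])

# Sorting by index puts the colors in label order 1, 2, 3.
sorted_colors = s.sort_index().tolist()

# Label-based alternative: convert the string column names to int first,
# so they can be looked up in the integer-indexed Series.
aligned_colors = s.reindex(df.columns.astype(int)).tolist()
```

Both lists come out as `['c', 'r', 'y']`, so column '1' is cyan, '2' is red and '3' is yellow. The sorted version relies on the columns already being in ascending order, while the `reindex` version follows whatever order the columns are in.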
