
Note: this page is derived from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/45147454/


Scatter plot from multiple columns of a pandas dataframe

Tags: python, pandas, matplotlib

Asked by kkhatri99

I have a pandas dataframe that looks as below:

    Filename    GalCer(18:1/12:0)_IS    GalCer(d18:1/16:0)  GalCer(d18:1/18:0)  
0   A-1-1   15.0    1.299366    40.662458   0.242658    6.891069    0.180315
1   A-1-2   15.0    1.341638    50.237734   0.270351    8.367316    0.233468
2   A-1-3   15.0    1.583500    47.039423   0.241681    7.902761    0.201153
3   A-1-4   15.0    1.635365    53.139610   0.322680    9.578195    0.345681
4   B-1-10  15.0    2.370330    80.209846   0.463770    13.729810   0.395355

I am trying to draw scatter subplots with a shared x-axis, using the first column, "Filename", on the x-axis. While I am able to generate bar plots, the following code raises a KeyError for a scatter plot:

import matplotlib.pyplot as plt
colnames = list (qqq.columns)

qqq.plot.scatter(x=qqq.Filename, y=colnames[1:], legend=False, subplots = True, sharex = True, figsize = (10,50))

KeyError: "['A-1-1' 'A-1-2' 'A-1-3' 'A-1-4' 'B-1-10' ] not in index"

The following code for barplots works fine. Do I need to specify something differently for the scatterplots?

import matplotlib.pyplot as plt
colnames = list (qqq.columns)
qqq.plot(x=qqq.Filename, y=colnames[1:], kind = 'bar', legend=False, subplots = True, sharex = True, figsize = (10,30))

Answered by ImportanceOfBeingErnest

A scatter plot requires numeric values on both axes, and the "Filename" column holds strings. In this case you can use the index as the x values:

df.reset_index().plot(x="index", y="other column")
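As a minimal sketch of the reset_index() call above, the snippet below plots a single column from three rows modelled on the question's data. The frame name qqq comes from the question; the choice of column, the tick relabelling, and the Agg backend line are assumptions added for illustration, not part of the original answer.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere; drop it for interactive use
import pandas as pd

# Three rows modelled on the question's data (one numeric column kept for brevity)
qqq = pd.DataFrame({
    "Filename": ["A-1-1", "A-1-2", "A-1-3"],
    "GalCer(d18:1/16:0)": [1.299366, 1.341638, 1.583500],
})

# reset_index() adds a numeric "index" column, which scatter accepts as x
ax = qqq.reset_index().plot.scatter(x="index", y="GalCer(d18:1/16:0)")

# Optional: relabel the numeric positions with the file names
ax.set_xticks(list(qqq.index))
ax.set_xticklabels(list(qqq["Filename"]))
```

Note that x and y are column labels here, not the column data itself; passing qqq.Filename makes pandas look up the file names as column labels, which is consistent with the KeyError in the question.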

The remaining problem is that the scatter plot wrapper in pandas cannot plot several columns at once. Depending on your reason for using a scatter plot, you may decide to use a line plot instead, just without lines: pass linestyle="none" and marker="o" to the plot call so that only the points appear.

import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

# Build a demo frame: a string "Filename" column followed by four numeric columns
fn = ["{}_{}".format(i, j) for i in list("ABCD") for j in range(4)]
df = pd.DataFrame(np.random.rand(len(fn), 4), columns=list("ZXYQ"))
df.insert(0, "Filename", pd.Series(fn))

colnames = list(df.columns)
# Line plot with markers only: one subplot per numeric column, numeric index on x
df.reset_index().plot(x="index", y=colnames[1:], kind="line", legend=False,
                      subplots=True, sharex=True, figsize=(5.5, 4), ls="none", marker="o")

plt.show()

In case you absolutely need a scatter plot, you may create a subplots grid first and then iterate over the columns and axes to plot one scatter plot at a time to the respective axes.

import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

fn = ["{}_{}".format(i,j) for i in list("ABCD") for j in range(4)]
df = pd.DataFrame(np.random.rand(len(fn), 4), columns=list("ZXYQ"))
df.insert(0,"Filename",pd.Series(fn))

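colnames = list(df.columns)
# One row of axes per numeric column, all sharing the x-axis
fig, axes = plt.subplots(nrows=len(colnames) - 1, sharex=True, figsize=(5.5, 4))

# Draw one scatter plot per column onto its own axes, colored by that column's values
for i, ax in enumerate(axes):
    df.reset_index().plot(x="index", y=colnames[i + 1], kind="scatter", legend=False,
                          ax=ax, c=colnames[i + 1], cmap="inferno")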

plt.show()
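A variant of the loop above, offered as a sketch rather than as part of the original answer: matplotlib's own Axes.scatter accepts string x values as categories (matplotlib 2.1 and later), so the file names can go on the x-axis directly without reset_index(). The small frame, the column names Z and X, and the Agg backend line are assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere; drop it for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical demo data: four files, two numeric columns
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "Filename": ["A_0", "A_1", "B_0", "B_1"],
    "Z": rng.random(4),
    "X": rng.random(4),
})
value_cols = [c for c in df.columns if c != "Filename"]

# One axes per numeric column, sharing the categorical x-axis
fig, axes = plt.subplots(nrows=len(value_cols), sharex=True, figsize=(5.5, 4))
for ax, col in zip(axes, value_cols):
    ax.scatter(df["Filename"], df[col])  # strings on x are treated as categories
    ax.set_ylabel(col)
```

This bypasses the pandas wrapper entirely, which also avoids the per-axes colorbar that the c/cmap arguments add.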
