Original source: http://stackoverflow.com/questions/45147454/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Scatter plot from multiple columns of a pandas dataframe
Asked by kkhatri99
I have a pandas dataframe that looks as below:
Filename GalCer(18:1/12:0)_IS GalCer(d18:1/16:0) GalCer(d18:1/18:0)
0 A-1-1 15.0 1.299366 40.662458 0.242658 6.891069 0.180315
1 A-1-2 15.0 1.341638 50.237734 0.270351 8.367316 0.233468
2 A-1-3 15.0 1.583500 47.039423 0.241681 7.902761 0.201153
3 A-1-4 15.0 1.635365 53.139610 0.322680 9.578195 0.345681
4 B-1-10 15.0 2.370330 80.209846 0.463770 13.729810 0.395355
I am trying to plot scatter subplots with a shared x-axis, with the first column "Filename" on the x-axis. While I am able to generate bar plots, the following code gives me a KeyError for a scatter plot:
import matplotlib.pyplot as plt
colnames = list (qqq.columns)
qqq.plot.scatter(x=qqq.Filename, y=colnames[1:], legend=False, subplots = True, sharex = True, figsize = (10,50))
KeyError: "['A-1-1' 'A-1-2' 'A-1-3' 'A-1-4' 'B-1-10' ] not in index"
The following code for barplots works fine. Do I need to specify something differently for the scatterplots?
import matplotlib.pyplot as plt
colnames = list (qqq.columns)
qqq.plot(x=qqq.Filename, y=colnames[1:], kind = 'bar', legend=False, subplots = True, sharex = True, figsize = (10,30))
Answered by ImportanceOfBeingErnest
A scatter plot will require numeric values for both axes. In this case you can use the index as x values:
df.reset_index().plot(x="index", y="other column")
The problem is now that you cannot plot several columns at once using the scatter plot wrapper in pandas. Depending on your reason for using a scatter plot, you may decide to use a line plot instead, just without lines. That is, you may pass linestyle="none" and marker="o" to the plot call, so that only the points appear on the plot.
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

# Synthetic data: 16 file names and four random numeric columns
fn = ["{}_{}".format(i, j) for i in list("ABCD") for j in range(4)]
df = pd.DataFrame(np.random.rand(len(fn), 4), columns=list("ZXYQ"))
df.insert(0, "Filename", pd.Series(fn))

colnames = list(df.columns)
# Line plot against the numeric index, drawn with markers only (no connecting lines)
df.reset_index().plot(x="index", y=colnames[1:], kind='line', legend=False,
                      subplots=True, sharex=True, figsize=(5.5, 4),
                      ls="none", marker="o")
plt.show()
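Because the x-axis in the example above is the numeric index, the ticks read 0, 1, 2, ... rather than the file names. The following is a minimal sketch of one way to put the names back as tick labels. It assumes the same synthetic df as in the answer; the Agg backend line and the names value_cols and bottom are additions for illustration, not part of the original answer.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Same synthetic data as in the answer
fn = ["{}_{}".format(i, j) for i in "ABCD" for j in range(4)]
df = pd.DataFrame(np.random.rand(len(fn), 4), columns=list("ZXYQ"))
df.insert(0, "Filename", fn)

value_cols = list(df.columns[1:])  # hypothetical helper name
axes = df.reset_index().plot(x="index", y=value_cols, kind="line", legend=False,
                             subplots=True, sharex=True, figsize=(5.5, 4),
                             ls="none", marker="o")

# With a shared x-axis, tick positions set on one axes apply to all of them;
# only the bottom axes displays the labels.
bottom = axes[-1]
bottom.set_xticks(range(len(df)))
bottom.set_xticklabels(df["Filename"].tolist(), rotation=90)
bottom.set_xlabel("Filename")
plt.tight_layout()
```

Call plt.show() or plt.gcf().savefig(...) afterwards, depending on whether an interactive backend is in use.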
In case you absolutely need a scatter plot, you may create a subplots grid first and then iterate over the columns and axes to plot one scatter plot at a time to the respective axes.
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

# Synthetic data: 16 file names and four random numeric columns
fn = ["{}_{}".format(i, j) for i in list("ABCD") for j in range(4)]
df = pd.DataFrame(np.random.rand(len(fn), 4), columns=list("ZXYQ"))
df.insert(0, "Filename", pd.Series(fn))

colnames = list(df.columns)
# One row of axes per numeric column, all sharing the x-axis
fig, axes = plt.subplots(nrows=len(colnames)-1, sharex=True, figsize=(5.5, 4))
for i, ax in enumerate(axes):
    # One scatter plot per column, drawn on its own axes and colored by value
    df.reset_index().plot(x="index", y=colnames[i+1], kind='scatter', legend=False,
                          ax=ax, c=colnames[i+1], cmap="inferno")
plt.show()
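As a variation on the loop above, matplotlib's own Axes.scatter can be called directly instead of the pandas wrapper. Recent matplotlib versions accept string values on an axis and treat them as categories, so the "Filename" column can be passed as x without resetting the index. This is a sketch under that assumption, not the answerer's method; the Agg backend line and the name value_cols are illustrative additions.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Same synthetic data as in the answer
fn = ["{}_{}".format(i, j) for i in "ABCD" for j in range(4)]
df = pd.DataFrame(np.random.rand(len(fn), 4), columns=list("ZXYQ"))
df.insert(0, "Filename", fn)

value_cols = list(df.columns[1:])  # hypothetical helper name
fig, axes = plt.subplots(nrows=len(value_cols), sharex=True, figsize=(5.5, 6))
for ax, col in zip(axes, value_cols):
    # String x values are mapped to categorical positions by matplotlib
    ax.scatter(df["Filename"], df[col], c=df[col], cmap="inferno")
    ax.set_ylabel(col)

# Rotate the file names so they do not overlap on the shared x-axis
axes[-1].tick_params(axis="x", labelrotation=90)
fig.tight_layout()
```

The file names then appear as tick labels automatically, at the cost of bypassing the pandas plotting API.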