Plotting and formatting a seaborn chart from a pandas dataframe

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/23160730/

Date: 2020-09-13 21:56:36  Source: igfitidea

plotting & formatting seaborn chart from pandas dataframe

Tags: python, matplotlib, pandas, seaborn

Asked by Luis Miguel

I have a pandas dataframe al_df that contains the population of Alabama from a recent US census. I created a cumulative function that I plot using seaborn, resulting in this chart:

[Image: CDF for the population of Alabama]

The code that relates to the plotting is this:

import matplotlib.pyplot as plt

plt.figure(num=None, figsize=(20, 10))

plt.title('Cumulative Distribution Function for ALABAMA population')
plt.xlabel('City')
plt.ylabel('Percentage')
#sns.set_style("whitegrid", {"ytick.major.size": "0.1",})
plt.plot(al_df.pop_cum_perc)

My questions are: 1) How can I change the ticks so that the y axis shows a grid line every 0.1 units instead of the default 0.2? 2) How can I change the x axis to show the actual names of the cities, written vertically, instead of the "rank" of each city (from the pandas index)? There are over 300 names, so they are not going to fit well horizontally.

Answered by Pablo Reyes

For question 1), add:

plt.yticks(np.arange(0,1+0.1,0.1))
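As a self-contained illustration of the line above, here is a minimal sketch. The random data is a hypothetical stand-in for al_df.pop_cum_perc (it is not the census data), and the Agg backend line is only there so the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical stand-in for al_df.pop_cum_perc: a cumulative share rising to 1.0
pop = np.random.default_rng(0).integers(1_000, 200_000, size=300)
pop_cum_perc = np.cumsum(np.sort(pop)[::-1]) / pop.sum()

fig, ax = plt.subplots(figsize=(20, 10))
ax.plot(pop_cum_perc)

# Same idea as plt.yticks(...), written against the Axes object:
# place a y tick at every 0.1 units from 0 up to 1
ax.set_yticks(np.arange(0, 1 + 0.1, 0.1))
ax.grid(True, axis="y")  # draw a horizontal grid line at each y tick
```

Calling plt.show() or fig.savefig(...) afterwards displays or stores the result. Because np.arange with a float step can be sensitive to rounding at the end point, np.linspace(0, 1, 11) is a possible alternative for the tick positions.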

For question 2), I found this in the matplotlib gallery: the ticks_and_spines example code.

Answered by CT Zhu

The matplotlib way would be to use MultipleLocator. The second one is also straightforward:

import matplotlib.pyplot as plt
from matplotlib.ticker import MultipleLocator

plt.plot(range(10))
ax = plt.gca()
ax.yaxis.set_major_locator(MultipleLocator(0.5))  # one y tick every 0.5 units
# For the real data this would be range(3xx), list_of_city_names, rotation=90
plt.xticks(range(10), list('ABCDEFGHIJ'), rotation=90)
plt.savefig('temp.png')
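To show how the same two calls might look when the labels come from a dataframe, here is a small sketch. The five-row al_df below is made up for illustration; only the column names NAME and pop_cum_perc are taken from this thread.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.ticker import MultipleLocator

# Hypothetical miniature of al_df (made-up names and values)
al_df = pd.DataFrame({
    "NAME": [f"City {i}" for i in range(5)],
    "pop_cum_perc": [0.30, 0.55, 0.75, 0.90, 1.00],
})

fig, ax = plt.subplots()
ax.plot(al_df["pop_cum_perc"].to_numpy())

# Question 1: one y tick (and grid line, if a grid is enabled) every 0.1 units
ax.yaxis.set_major_locator(MultipleLocator(0.1))

# Question 2: one x tick per row, labelled with the city name, written vertically
ax.set_xticks(range(len(al_df)))
ax.set_xticklabels(al_df["NAME"].tolist(), rotation=90)
fig.tight_layout()  # leave room for the rotated labels
```

Setting the tick positions and the labels together keeps each name attached to its own row, which matters once the frame has hundreds of rows.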

Answered by Luis Miguel

After some research, and not being able to find a "native" Seaborn solution, I came up with the code below, partially based on the suggestions from @Pablo Reyes and @CT Zhu, and using matplotlib functions:

import matplotlib.pyplot as plt
from matplotlib.ticker import MultipleLocator

plt.figure(num=None, figsize=(20, 10))

plt.title('Cumulative Distribution Function for ALABAMA population')
plt.xlabel('City')
plt.ylabel('Percentage')
plt.plot(al_df.pop_cum_perc)

# set the tick spacing of the y axis
ax = plt.gca()
ax.yaxis.set_major_locator(MultipleLocator(0.1))

# set the tick spacing, labels and text orientation of the x axis
ax.xaxis.set_major_locator(MultipleLocator(10))
ax.set_xticklabels(labels, rotation=90)

The solution introduced a new element, "labels", which I had to define before the plot as a new array of city names taken from my pandas dataframe:

labels = al_df.NAME.values[:]

This produces the following chart: [image]
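One thing worth checking in the code above: as far as I can tell, set_xticklabels hands out the labels to the ticks in order, so combined with MultipleLocator(10) a tick at x = 10 would not receive labels[10]. A possible way to keep names and positions in step is to set the positions explicitly and slice the labels with the same indices. The sketch below uses a made-up 300-row al_df; only the column names come from the thread.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical stand-in for al_df: 300 made-up cities with a cumulative share
n = 300
pop = np.sort(np.random.default_rng(0).integers(1_000, 200_000, size=n))[::-1]
al_df = pd.DataFrame({
    "NAME": [f"City {i}" for i in range(n)],
    "pop_cum_perc": np.cumsum(pop) / pop.sum(),
})
labels = al_df["NAME"].to_numpy()

fig, ax = plt.subplots(figsize=(20, 10))
ax.plot(al_df["pop_cum_perc"].to_numpy())

# Label every 10th city: the tick at x = 10 * k gets the name of row 10 * k
step = 10
positions = np.arange(0, len(labels), step)
ax.set_xticks(positions)
ax.set_xticklabels(labels[positions], rotation=90)
```

Changing step then changes how many cities are labelled without shifting any name away from its row.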

The tick spacing requires some tweaking, since asking for a tick for every city in the pandas data frame, like this:

ax.xaxis.set_major_locator(MultipleLocator(1))

produces a chart that is impossible to read (only the x axis is displayed): [image]
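If the spacing needs to be tuned by trial and error, one option (not part of the original answers) is to look up each label from the tick position with a FuncFormatter, so that only the MultipleLocator argument changes between attempts. The data and the helper name city_name below are hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from matplotlib.ticker import FuncFormatter, MultipleLocator

# Hypothetical stand-in for al_df: 300 made-up cities with a cumulative share
n = 300
pop = np.sort(np.random.default_rng(0).integers(1_000, 200_000, size=n))[::-1]
al_df = pd.DataFrame({
    "NAME": [f"City {i}" for i in range(n)],
    "pop_cum_perc": np.cumsum(pop) / pop.sum(),
})
labels = al_df["NAME"].to_numpy()


def city_name(x, pos):
    """Return the city name for the row at tick position x, or '' if x is not a valid row."""
    i = int(round(x))
    if abs(x - i) > 1e-9 or not 0 <= i < len(labels):
        return ""
    return str(labels[i])


fig, ax = plt.subplots(figsize=(20, 10))
ax.plot(al_df["pop_cum_perc"].to_numpy())

# Try different spacings here; the formatter keeps each name on its own row
ax.xaxis.set_major_locator(MultipleLocator(25))
ax.xaxis.set_major_formatter(FuncFormatter(city_name))
ax.tick_params(axis="x", labelrotation=90)
```

Ticks that fall outside the data range simply get an empty label, so a locator that emits a tick left of 0 or right of the last row does no harm.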