Original source: http://stackoverflow.com/questions/11586989/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
pandas, matplotlib, use dataframe index as axis tick labels
Asked by tbc
I am using matplotlib's imshow() function to show a pandas.DataFrame.
I would like the labels and ticks for both x and y axes to be drawn from the DataFrame.index and DataFrame.columns lists, but I can't figure out how to do it.
Assuming that data is a pandas.DataFrame:
>>> print data
<class 'pandas.core.frame.DataFrame'>
Index: 201 entries, 1901 to 2101
Data columns:
jan 201 non-null values
feb 201 non-null values
mar 201 non-null values
apr 201 non-null values
may 201 non-null values
jun 201 non-null values
jul 201 non-null values
aug 201 non-null values
sep 201 non-null values
oct 201 non-null values
nov 201 non-null values
dec 201 non-null values
When I do this:
ax1 = fig.add_subplot(131, xticklabels=data.columns, yticklabels=data.index)
ax1.set_title("A")
ax1.tick_params(axis='both', direction='out')
im1 = ax1.imshow(data,
interpolation='nearest',
aspect='auto',
cmap=cmap )
I end up with nicely spaced tick labels on the y axis of the image, but the labels are 1901-1906 instead of 1901 through 2101. Likewise, the x axis tick labels are feb-jul instead of jan-dec.
If I instead use:
ax1 = fig.add_subplot(131) # without specifying tick labels
Then I end up with the axis tick labels simply being the underlying ndarray index values (i.e. 0-201 and 0-12). I don't need to modify the spacing or quantity of ticks and labels, I just want the label text to come from the DataFrame index and column lists. Not sure if I am missing something easy or not?
Thanks in advance.
Accepted answer by PJW
As a general solution, I have found the following method to be an easy way to bring a Pandas datetime64 index into a matplotlib axis label.
First, create a new sequence by converting the pandas datetime64 index to Python datetime.datetime objects (to_pydatetime() returns a NumPy array of them).
new_series = your_pandas_dataframe.index.to_pydatetime()
Now you have all the functionality of matplotlib.dates. Before plotting, import matplotlib.dates as mdates and declare the following variables:
years = mdates.YearLocator()
months = mdates.MonthLocator()
days = mdates.DayLocator()
hours = mdates.HourLocator(byhour=[0, 12])  # ticks at 00:00 and 12:00, i.e. every 12 hours; HourLocator(12) would tick only at 12:00
minutes = mdates.MinuteLocator()
daysFmt = mdates.DateFormatter('%m/%d') #or whatever format you want
Now, make your plots, using the new_series as the x-axis:
fig1 = plt.figure()
ax = fig1.add_subplot(111)
ax.plot(new_series, your_pandas_dataframe)
You can use the mdates locators and formatter declared above to tweak the labels and ticks to your liking, such as:
ax.xaxis.set_major_locator(days)
ax.xaxis.set_major_formatter(daysFmt)
ax.xaxis.set_minor_locator(hours)
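The steps in this answer can be put together as one runnable sketch. The data here is hypothetical (three days of hourly values invented for illustration), and the names frame, index, and major_labels are not from the original answer. The Agg backend is selected only so the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import numpy as np
import pandas as pd

# Hypothetical data: 72 hourly readings starting at 2012-07-01 00:00.
start = pd.Timestamp("2012-07-01")
index = start + pd.to_timedelta(np.arange(72), unit="h")
frame = pd.DataFrame({"value": np.sin(np.arange(72) / 6.0)}, index=index)

# Convert the DatetimeIndex to an array of datetime.datetime objects.
x = frame.index.to_pydatetime()

fig, ax = plt.subplots()
ax.plot(x, frame["value"])

# Major ticks once per day, labelled month/day; minor ticks at 00:00 and 12:00.
ax.xaxis.set_major_locator(mdates.DayLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter("%m/%d"))
ax.xaxis.set_minor_locator(mdates.HourLocator(byhour=[0, 12]))

# Drawing the canvas makes matplotlib compute the tick label text.
fig.canvas.draw()
major_labels = [tick.get_text() for tick in ax.get_xticklabels()]
```

After the draw call, major_labels holds the formatted day labels, which is a quick way to check the formatter before saving or showing the figure.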
Answer by zarthur
I believe the issue has to do with specifying the tick labels for existing ticks. By default, there are fewer ticks than labels so only the first few labels are used. The following should work by first setting the number of ticks.
ax1 = fig.add_subplot(131)
ax1.set_title("A")
ax1.tick_params(axis='both', direction='out')
ax1.set_xticks(range(len(data.columns)))
ax1.set_xticklabels(data.columns)
ax1.set_yticks(range(len(data.index)))
ax1.set_yticklabels(data.index)
im1 = ax1.imshow(data, interpolation='nearest', aspect='auto', cmap=cmap)
This produces a tick for every year on the y-axis, so you might want to use a subset of the index values.
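A minimal sketch of labelling only a subset of the rows, assuming a frame shaped like the one in the question. The random data, the step of 25 years, and the names row_positions and row_labels are illustrative choices, not part of the original answer.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

months = ["jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec"]
# Hypothetical stand-in for the question's data: 201 years x 12 months.
data = pd.DataFrame(np.random.randn(201, len(months)),
                    index=np.arange(1901, 2102), columns=months)

step = 25  # assumed spacing: label every 25th row
row_positions = np.arange(0, len(data.index), step)
row_labels = [str(year) for year in data.index[row_positions]]

fig, ax = plt.subplots()
im = ax.imshow(data.values, interpolation="nearest", aspect="auto")

# Every column gets a tick; only every 25th row does.
ax.set_xticks(np.arange(len(data.columns)))
ax.set_xticklabels(data.columns)
ax.set_yticks(row_positions)
ax.set_yticklabels(row_labels)
```

The key point is that tick positions and tick labels are sliced with the same positions, so each label stays attached to the row it describes.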
Answer by Phillip Cloud
I've found that the easiest way to do this is with ImageGrid. Here's the code to do this along with the plot; there is also an IPython notebook that shows it in a more presentable format:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from mpl_toolkits.axes_grid1 import ImageGrid

mons = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
        'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
# just get the first 5 rows for illustration purposes
df = pd.DataFrame(np.random.randn(201, len(mons)), columns=mons,
                  index=np.arange(1901, 2102))[:5]
fig = plt.figure(figsize=(20, 100))
# add_all=True from the original answer is omitted: newer matplotlib releases no longer accept it
grid = ImageGrid(fig, 111, nrows_ncols=(1, 1),
                 direction='row', axes_pad=0.05,
                 label_mode='1', share_all=False,
                 cbar_location='right', cbar_mode='single',
                 cbar_size='10%', cbar_pad=0.05)
ax = grid[0]
ax.set_title('A', fontsize=40)
ax.tick_params(axis='both', direction='out', labelsize=20)
im = ax.imshow(df.values, interpolation='nearest', vmax=df.max().max(),
               vmin=df.min().min())
# draw the colorbar in the grid's shared colorbar axes
ax.cax.colorbar(im)
ax.cax.tick_params(labelsize=20)
# one tick per column and per row, labelled from the DataFrame
ax.set_xticks(np.arange(df.shape[1]))
ax.set_xticklabels(mons)
ax.set_yticks(np.arange(df.shape[0]))
ax.set_yticklabels(df.index)
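The question asks to keep matplotlib's automatic tick spacing and change only the label text. One possible alternative to fixed tick lists, not taken from any of the answers above, is a tick formatter that looks each tick position up in the index. The helper name index_formatter and the random data below are assumptions made for this sketch.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from matplotlib.ticker import FuncFormatter, MaxNLocator


def index_formatter(labels):
    """Return a formatter mapping integer tick positions to entries of labels."""
    def _format(position, _tick_number):
        i = int(round(position))
        # Blank label for non-integer positions or positions outside the data.
        if abs(position - i) > 1e-9 or not 0 <= i < len(labels):
            return ""
        return str(labels[i])
    return FuncFormatter(_format)


months = ["jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec"]
# Hypothetical stand-in for the question's data: 201 years x 12 months.
data = pd.DataFrame(np.random.randn(201, len(months)),
                    index=np.arange(1901, 2102), columns=months)

fig, ax = plt.subplots()
ax.imshow(data.values, interpolation="nearest", aspect="auto")

# Let matplotlib pick the tick positions (integers only), then translate them.
for axis, labels in ((ax.xaxis, data.columns), (ax.yaxis, data.index)):
    axis.set_major_locator(MaxNLocator(integer=True))
    axis.set_major_formatter(index_formatter(labels))

year_formatter = index_formatter(data.index)
```

Because the labels are computed from whatever positions the locator chooses, they stay correct when the axes are zoomed or resized, which fixed tick label lists do not.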



