Note: this page reproduces a popular Stack Overflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original: http://stackoverflow.com/questions/30133280/

Retrieved: 2020-09-08 15:39:35. Source: igfitidea

Pandas bar plot changes date format

Tags: pandas, matplotlib, plot

Asked by Ted Petrou

I have a simple stacked line plot whose date format is set automatically, and it is exactly the format I want, when I use the following code.

df_ts = df.resample("W").max()
df_ts.plot(figsize=(12,8), stacked=True)

[Image: stacked line plot with well-formatted date labels]

However, the dates mysteriously transform themselves into an ugly and unreadable format when I plot the same data as a bar plot.

df_ts = df.resample("W").max()
df_ts.plot(kind='bar', figsize=(12,8), stacked=True)

[Image: stacked bar plot with crowded, unreadable date labels]

The original data was resampled to its weekly maximum. Why do the automatically set date labels change so radically? How can I get the nicely formatted dates shown above?

Here is some dummy data:

import numpy as np
import pandas as pd

start = pd.to_datetime("1-1-2012")
idx = pd.date_range(start, periods= 365).tolist()
df=pd.DataFrame({'A':np.random.random(365), 'B':np.random.random(365)})
df.index = idx
df_ts = df.resample('W').max()
df_ts.plot(kind='bar', stacked=True)

Answered by unutbu

The plotting code assumes that each bar in a bar plot deserves its own label. You could override this assumption by specifying your own formatter:

ax.xaxis.set_major_formatter(formatter)
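As a minimal sketch of what such a formatter could look like, the example below uses matplotlib's `ticker.FuncFormatter` to turn each bar position back into a date from the index. The helper name `make_date_formatter`, its `every` and `fmt` parameters, and the dummy data are assumptions made for illustration; they are not part of the original answer. It relies on bar positions being integer offsets into the index.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
import numpy as np
import pandas as pd

idx = pd.date_range("2012-01-01", periods=365)
df = pd.DataFrame({"A": np.random.random(365), "B": np.random.random(365)}, index=idx)
df_ts = df.resample("W").max()


def make_date_formatter(index, every=4, fmt="%b %d"):
    """Hypothetical helper: label every `every`-th bar with its date."""
    def _format(x, pos=None):
        i = int(round(x))
        # Leave the label empty for non-integer, out-of-range, or skipped positions
        if abs(x - i) > 1e-9 or i < 0 or i >= len(index) or i % every:
            return ""
        return index[i].strftime(fmt)
    return ticker.FuncFormatter(_format)


ax = df_ts.plot(kind="bar", stacked=True)
ax.xaxis.set_major_formatter(make_date_formatter(df_ts.index))
```

Because the formatter looks dates up by position, it has to be built from the same index that was plotted.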

The pandas.tseries.converter.TimeSeries_DateFormatter that Pandas uses to format the dates in the "good" plot works well with line plots, where the x-values are dates. With a bar plot, however, the x-values (at least those received by TimeSeries_DateFormatter.__call__) are merely integers starting at zero. If you try to use TimeSeries_DateFormatter with a bar plot, all the labels therefore start at the Epoch, 1970-1-1 UTC, since this is the date that corresponds to zero. So the formatter used for line plots is unfortunately useless for bar plots (at least as far as I can see).
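The integer x-values can be checked directly. The following is a small sketch on made-up data (the ten-week index and the variable names are assumptions for illustration): it reads the tick positions of a pandas bar plot and maps one of them back to its date by indexing into the DataFrame's index.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

idx = pd.date_range("2012-01-01", periods=10, freq="W")
df_ts = pd.DataFrame({"A": np.random.random(10)}, index=idx)

fig, ax = plt.subplots()
df_ts.plot(kind="bar", ax=ax)

# Tick positions are plain bar positions, not dates
bar_positions = ax.get_xticks()
print(bar_positions)

# A position only becomes a date again by looking it up in the index
fourth_bar_date = df_ts.index[int(bar_positions[3])]
print(fourth_bar_date)
```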

The easiest way I see to produce the desired formatting is to generate and set the labels explicitly:

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import matplotlib.ticker as ticker

start = pd.to_datetime("5-1-2012")
idx = pd.date_range(start, periods= 365)
df = pd.DataFrame({'A':np.random.random(365), 'B':np.random.random(365)})
df.index = idx
df_ts = df.resample('W').max()

# The index is used for the x-axis by default, so no x= argument is needed
ax = df_ts.plot(kind='bar', stacked=True)

# Make most of the ticklabels empty so the labels don't get too crowded
ticklabels = ['']*len(df_ts.index)
# Every 4th ticklabel shows the month and day
ticklabels[::4] = [item.strftime('%b %d') for item in df_ts.index[::4]]
# Every 12th ticklabel includes the year
ticklabels[::12] = [item.strftime('%b %d\n%Y') for item in df_ts.index[::12]]
ax.xaxis.set_major_formatter(ticker.FixedFormatter(ticklabels))
plt.gcf().autofmt_xdate()

plt.show()

which yields:

[Image: stacked bar plot with sparse month/day labels and the year on every 12th label]
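The label-building step above can also be pulled out into a small reusable function. This is only a sketch: the name `sparse_date_labels`, its parameters, and the sample index are assumptions for illustration, while the slicing logic mirrors the answer's code.

```python
import pandas as pd


def sparse_date_labels(index, step=4, year_step=12,
                       fmt="%b %d", year_fmt="%b %d\n%Y"):
    """Hypothetical helper: one label per bar, most of them empty."""
    labels = [""] * len(index)
    # Every `step`-th label shows the month and day
    labels[::step] = [ts.strftime(fmt) for ts in index[::step]]
    # Every `year_step`-th label also includes the year
    labels[::year_step] = [ts.strftime(year_fmt) for ts in index[::year_step]]
    return labels


weekly_index = pd.date_range("2012-05-06", periods=26, freq="W")
labels = sparse_date_labels(weekly_index)
```

The returned list can then be passed to `ticker.FixedFormatter`, as in the answer's code.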



For those looking for a simple example of a bar plot with dates:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.ticker as mticker

dates = pd.date_range('2012-1-1', '2017-1-1', freq='M')
df = pd.DataFrame({'A':np.random.random(len(dates)), 'Date':dates})
fig, ax = plt.subplots()
df.plot.bar(x='Date', y='A', ax=ax)
ticklabels = ['']*len(df)
skip = len(df)//12
ticklabels[::skip] = df['Date'].iloc[::skip].dt.strftime('%Y-%m-%d')
ax.xaxis.set_major_formatter(mticker.FixedFormatter(ticklabels))
fig.autofmt_xdate()

# Make the interactive cursor readout report the date rather than the raw bar position
# https://matplotlib.org/users/recipes.html
def fmt(x, pos=0, max_i=len(ticklabels)-1):
    i = int(x) 
    i = 0 if i < 0 else max_i if i > max_i else i
    return dates[i]
ax.fmt_xdata = fmt
plt.show()


Answered by Arleg

I've struggled with this problem too, and after reading several posts I came up with the following solution, which seems to me slightly clearer than the matplotlib.dates approach.

Labels without modification:

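import numpy as np
import pandas as pd

# pd.date_range replaces the DatetimeIndex(start=..., freq=..., periods=...) constructor form
timeline = pd.date_range(start='2018-11-01', freq='M', periods=15)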
df = pd.DataFrame({'date': timeline, 'value': np.random.randn(15)})
df.set_index('date', inplace=True)
df.plot(kind='bar', figsize=(12, 8), color='#2ecc71')

[Image: bar plot with the default, unmodified date labels]

Labels with modification:

def line_format(label):
    """
    Convert time label to the format of pandas line plot
    """
    month = label.month_name()[:3]
    if month == 'Jan':
        month += f'\n{label.year}'
    return month

# Note that we specify rot here
ax = df.plot(kind='bar', figsize=(12, 8), color='#2ecc71', rot=0)
ax.set_xticklabels([line_format(ts) for ts in df.index])

[Image: bar plot with abbreviated month labels and the year shown under January]

This approach adds the year to the label only for January.
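One consequence of that rule is that a series starting in, say, November shows no year until the first January. A possible variation, sketched here as an assumption rather than as part of the original answer, also puts the year on the very first label. The function name `month_labels` is hypothetical, and the sketch uses month-start dates (`freq="MS"`) instead of the month-end dates in the answer.

```python
import pandas as pd


def month_labels(index):
    """Hypothetical helper: month abbreviations, with the year on the
    first label and on every January."""
    labels = []
    for i, ts in enumerate(index):
        label = ts.strftime("%b")
        if i == 0 or ts.month == 1:
            label += f"\n{ts.year}"
        labels.append(label)
    return labels


timeline = pd.date_range(start="2018-11-01", periods=15, freq="MS")
labels = month_labels(timeline)
```

The result can be passed to `ax.set_xticklabels(labels)` on a bar plot of a frame indexed by `timeline`.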

Answered by eecharlie

Here's a possibly easier approach using mdates, though it requires you to loop over your columns, calling bar from matplotlib directly. Here's an example where I plot just one column and use mdates for customized ticks and labels (EDIT: added a looping function to plot all columns stacked):

import datetime
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

def format_x_date_month_day(ax):   
    # Standard date x-axis formatting block, labels each month and ticks each day
    days = mdates.DayLocator()
    months = mdates.MonthLocator()  # every month
    dayFmt = mdates.DateFormatter('%D')
    monthFmt = mdates.DateFormatter('%Y-%m')
    ax.figure.autofmt_xdate()
    ax.xaxis.set_major_locator(months) 
    ax.xaxis.set_major_formatter(monthFmt)
    ax.xaxis.set_minor_locator(days)

def df_stacked_bar_formattable(df, ax, **kwargs):
    P = []
    lastBar = None

    for col in df.columns:
        X = df.index
        Y = df[col]
        if lastBar is not None:
            P.append(ax.bar(X, Y, bottom=lastBar, **kwargs))
            # Accumulate heights so a third or later column stacks on all previous ones
            lastBar = lastBar + Y
        else:
            P.append(ax.bar(X, Y, **kwargs))
            lastBar = Y
    plt.legend([p[0] for p in P], df.columns)

span_days = 90
start = pd.to_datetime("1-1-2012")
idx = pd.date_range(start, periods=span_days).tolist()
df=pd.DataFrame(index=idx, data={'A':np.random.random(span_days), 'B':np.random.random(span_days)})

plt.close('all')
fig, ax = plt.subplots(1)
df_stacked_bar_formattable(df, ax)
format_x_date_month_day(ax)
plt.show()

(See matplotlib.org for an example of looping to create a stacked bar plot.) This gives us

[Image: stacked bar plot with month labels on the date axis]

Another approach that should work and be much easier is to use df.plot.bar(ax=ax, stacked=True); however, it does not admit date axis formatting with mdates and is the subject of my question.
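For comparison, here is a compact sketch of the matplotlib route using stock mdates classes. Because `ax.bar` receives real dates as x-values, date locators and formatters apply. The choice of `AutoDateLocator` with `ConciseDateFormatter` is an assumption made for illustration and is not what the answer above uses.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

idx = pd.date_range("2012-01-01", periods=90)
df = pd.DataFrame({"A": np.random.random(90), "B": np.random.random(90)}, index=idx)

fig, ax = plt.subplots(figsize=(12, 8))
bottom = np.zeros(len(df))
for col in df.columns:
    heights = df[col].to_numpy()
    # Each column is drawn on top of the running total of the previous ones
    ax.bar(df.index, heights, bottom=bottom, width=0.8, label=col)
    bottom += heights
ax.legend()

# The x-axis holds dates here, so mdates locators and formatters work
locator = mdates.AutoDateLocator()
ax.xaxis.set_major_locator(locator)
ax.xaxis.set_major_formatter(mdates.ConciseDateFormatter(locator))
```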

Answered by Daniel R

Maybe not the most elegant, but hopefully an easy way:

fig = plt.figure() 
ax = fig.add_subplot(111)

df_ts.plot(kind='bar', figsize=(12,8), stacked=True,ax=ax)
ax.set_xticklabels([''] * len(df_ts.index))  # blank out the per-bar labels

df_ts.plot(linewidth=0, ax=ax)  # This sets the nice x_ticks automatically

[EDIT]: ax=ax is needed in df_ts.plot().

[Image: resulting plot]