Notice: this page is a Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow. Original URL: http://stackoverflow.com/questions/30379789/

Time: 2020-09-13 23:23:01  Source: igfitidea

Plot pandas data frame with year over year data

python pandas

Asked by Ivan

I have a data frame in the following format:

              value
2000-01-01    1
2000-03-01    2
2000-06-01    15
2000-09-01    3
2000-12-01    7
2001-01-01    1
2001-03-01    3
2001-06-01    8
2001-09-01    5
2001-12-01    3
2002-01-01    1
2002-03-01    1
2002-06-01    8
2002-09-01    5
2002-12-01    19

(the index is a datetime index) and I need to plot all results year over year to compare the values every 3 months (the data can also be monthly), plus the average across all years.

I can easily plot them separately, but because of the index, each plot is shifted along the x-axis according to its dates:

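import matplotlib.pyplot as plt

fig, axes = plt.subplots()
# Each year keeps its own dates, so the lines appear one after another.
df.loc['2000'].plot(ax=axes, label='2000')
df.loc['2001'].plot(ax=axes, label='2001')
df.loc['2002'].plot(ax=axes, label='2002')
# Mean per calendar month across all years.
axes.plot(df.loc['2000':'2002'].groupby(df.loc['2000':'2002'].index.month).mean())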

So it's not the desired result. I've seen some answers here, but you have to concat, create a multiindex and plot. If one of the data frames has NaNs or missing values, it can be very cumbersome. Is there a pandas way to do it?

Answered by sinhrks

Is this what you want? You can add the means after the transformation.

import pandas as pd

df = pd.DataFrame({'value': [1, 2, 15, 3, 7, 1, 3, 8, 5, 3, 1, 1, 8, 5, 19]},
                  index=pd.DatetimeIndex(['2000-01-01', '2000-03-01', '2000-06-01', '2000-09-01',
                                          '2000-12-01', '2001-01-01', '2001-03-01', '2001-06-01',
                                          '2001-09-01', '2001-12-01', '2002-01-01', '2002-03-01',
                                          '2002-06-01', '2002-09-01', '2002-12-01']))


pv = pd.pivot_table(df, index=df.index.month, columns=df.index.year,
                    values='value', aggfunc='sum')
pv
#     2000  2001  2002
# 1      1     1     1
# 3      2     3     1
# 6     15     8     8
# 9      3     5     5
# 12     7     3    19

pv.plot()
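
The answer mentions adding the means after the transformation but does not show it. A minimal sketch of one way to do that, assuming the same sample data; the helper columns `year` and `month` and the column name `mean` are illustrative choices, not part of the original answer:

```python
import pandas as pd

# Sample data from the question: five observations per year, 2000-2002.
dates = pd.to_datetime([f'{y}-{m:02d}-01'
                        for y in (2000, 2001, 2002)
                        for m in (1, 3, 6, 9, 12)])
df = pd.DataFrame({'value': [1, 2, 15, 3, 7, 1, 3, 8, 5, 3, 1, 1, 8, 5, 19]},
                  index=dates)

# Illustrative helper columns (not in the original answer) so the pivot
# can refer to plain column names: months become rows, years become columns.
pv = (
    df.assign(year=df.index.year, month=df.index.month)
      .pivot_table(index='month', columns='year', values='value', aggfunc='sum')
)

# Row-wise mean across the year columns. NaN entries (months missing in
# some years) are skipped by default, so no manual alignment is needed.
pv['mean'] = pv.mean(axis=1)
```

Calling `pv.plot()` on this frame should then draw one line per year plus one for the mean, all against the month number.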

Answered by joris

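One possibility is to use the day of the year as the x-axis, so that the dates in the index no longer decide where each year is drawn. In recent pandas versions the x kwarg of plot expects a column label or position rather than an array, so here the dayofyear values are passed to matplotlib directly:

fig, axes = plt.subplots()
for year in ['2000', '2001']:
    yearly = df.loc[year]
    # Plot against the day of the year so both years share one x-axis.
    axes.plot(yearly.index.dayofyear, yearly['value'], label=year)
axes.legend()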

Alternatively, you can also add this as a column, and then refer to the column name.
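
A rough sketch of that column-based variant, assuming the sample data from the question; the column name `dayofyear` is just an illustrative choice, and the `Agg` backend is only set so the snippet runs without a display:

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Sample data from the question.
dates = pd.to_datetime([f'{y}-{m:02d}-01'
                        for y in (2000, 2001, 2002)
                        for m in (1, 3, 6, 9, 12)])
df = pd.DataFrame({'value': [1, 2, 15, 3, 7, 1, 3, 8, 5, 3, 1, 1, 8, 5, 19]},
                  index=dates)

# Store the day of the year as a regular column (illustrative name).
df['dayofyear'] = df.index.dayofyear

fig, ax = plt.subplots()
# One line per year, all drawn against the same 'dayofyear' column.
for year, group in df.groupby(df.index.year):
    group.plot(x='dayofyear', y='value', ax=ax, label=str(year))
```

Referring to the column by name keeps the call within what `DataFrame.plot` documents for `x` and `y` (a label or position).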

If the data are monthly, you can of course use the month attribute of the index as well.

The disadvantage of the above approach is that you don't have the nice datetime formatting of the x-axis.
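
One possible workaround for that drawback, which is not part of the original answer, is to move every date into a single placeholder year so the x-axis stays a real datetime axis. The sketch below assumes the sample data from the question and arbitrarily picks 2000 as the placeholder year:

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend; drop this line in a notebook
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd

# Sample data from the question.
dates = pd.to_datetime([f'{y}-{m:02d}-01'
                        for y in (2000, 2001, 2002)
                        for m in (1, 3, 6, 9, 12)])
df = pd.DataFrame({'value': [1, 2, 15, 3, 7, 1, 3, 8, 5, 3, 1, 1, 8, 5, 19]},
                  index=dates)

fig, ax = plt.subplots()
for year, group in df.groupby(df.index.year):
    # Move each date into the placeholder year 2000 (a leap year, so a
    # Feb 29 in the data would still be a valid date).
    shifted = group.index.map(lambda ts: ts.replace(year=2000))
    ax.plot(shifted, group['value'], label=str(year))

# Show only month names on the ticks, since the year is a placeholder.
ax.xaxis.set_major_formatter(mdates.DateFormatter('%b'))
ax.legend()
```

The trade-off is that the year shown by the axis is meaningless, which is why the tick labels are limited to month names.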
