Original question: http://stackoverflow.com/questions/31820578/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on Stack Overflow (not to this page).
How to plot stacked event duration (Gantt Charts) using Python Pandas?
Asked by Inkenbrandt
I have a Pandas DataFrame containing the date that a stream gage started measuring flow and the date that the station was decommissioned. I want to generate a plot showing these dates graphically. Here is a sample of my DataFrame:
index       StationId                 amin                 amax
40623  UTAHDWQ-5932100  1994-07-19 13:15:00  1998-06-30 14:51:00
40637  UTAHDWQ-5932230  2006-03-16 13:55:00  2007-01-24 12:55:00
40666  UTAHDWQ-5932240  1980-10-31 16:00:00  2007-07-31 11:35:00
40697  UTAHDWQ-5932250  1981-06-11 17:45:00  1990-08-01 08:30:00
40728  UTAHDWQ-5932253  2006-06-28 13:15:00  2007-01-24 13:35:00
40735  UTAHDWQ-5932254  2006-06-28 13:55:00  2007-01-24 14:05:00
40742  UTAHDWQ-5932280  1981-06-11 15:30:00  2006-08-22 16:00:00
40773  UTAHDWQ-5932290  1992-06-10 15:45:00  1998-06-30 11:33:00
40796  UTAHDWQ-5932750  2005-10-03 16:30:00  2005-10-22 15:00:00
40819  UTAHDWQ-5983753  2006-04-25 09:56:00  2006-04-25 10:00:00
40823  UTAHDWQ-5983754  2006-04-25 11:05:00  2008-04-08 12:16:00
40845  UTAHDWQ-5983755  2006-04-25 13:50:00  2008-04-08 09:10:00
40867  UTAHDWQ-5983756  2006-04-25 14:20:00  2008-04-08 09:30:00
40887  UTAHDWQ-5983757  2006-04-25 12:45:00  2008-04-08 11:27:00
40945  UTAHDWQ-5983759  2008-04-08 13:03:00  2008-04-08 13:05:00
40964  UTAHDWQ-5983760  2008-04-08 13:15:00  2008-04-08 13:23:00
40990  UTAHDWQ-5983775  2008-04-15 12:47:00  2009-04-07 13:15:00
41040  UTAHDWQ-5989066  2005-10-04 10:15:00  2005-10-05 11:40:00
41091  UTAHDWQ-5996780  1995-03-09 13:59:00  1996-03-14 10:40:00
41100  UTAHDWQ-5996800  1995-03-09 15:13:00  1996-03-14 11:05:00
I want to create a plot similar to this (please note that I did not make this plot using the above data):

The plot does not have to have the text shown along each line, just the y-axis with station names.
While this may seem like a niche application of pandas, I know several scientists that would benefit from this plotting ability.
The closest answers I could find are these:
- How to plot stacked proportional graph?
- How to plot two columns of a pandas data frame using points?
- Matplotlib timelines
- Create gantt Plot with python matplotlib
The last answer is closest to suiting my needs.
While I would prefer a way to do it through the Pandas wrapper, I would be open to, and grateful for, a straight matplotlib solution.
Accepted answer by DTing
I think you are trying to create a Gantt plot. This suggests using hlines:
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as dt

df = pd.read_csv('data.csv')
# Parse the start and end columns as datetimes
df['amin'] = pd.to_datetime(df['amin'])
df['amax'] = pd.to_datetime(df['amax'])

fig = plt.figure()
ax = fig.add_subplot(111)
# Treat x values as dates; xaxis_date() returns None, so ax must not be reassigned
ax.xaxis_date()
# One horizontal line per row, running from the start date to the end date
ax.hlines(df.index, dt.date2num(df['amin']), dt.date2num(df['amax']))
plt.show()
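The question also asked for station names on the y-axis, which the hlines call above does not label. A minimal sketch of one way to add them follows, assuming the column names from the question (StationId, amin, amax). It uses three rows copied from the sample data instead of data.csv, and the output file name is hypothetical.

```python
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd

# Three rows copied from the sample DataFrame in the question
df = pd.DataFrame({
    "StationId": ["UTAHDWQ-5932100", "UTAHDWQ-5932230", "UTAHDWQ-5932240"],
    "amin": pd.to_datetime(["1994-07-19 13:15:00", "2006-03-16 13:55:00",
                            "1980-10-31 16:00:00"]),
    "amax": pd.to_datetime(["1998-06-30 14:51:00", "2007-01-24 12:55:00",
                            "2007-07-31 11:35:00"]),
})

fig, ax = plt.subplots()
y = list(range(len(df)))  # one row position per station

# Draw one horizontal line per station from its start date to its end date
lines = ax.hlines(y, mdates.date2num(df["amin"]), mdates.date2num(df["amax"]),
                  linewidth=4)

# Put the station names on the y-axis instead of the row positions
ax.set_yticks(y)
ax.set_yticklabels(df["StationId"])
ax.xaxis_date()  # interpret the numeric x values as dates
ax.set_xlabel("Date")
fig.tight_layout()
fig.savefig("stations_gantt.png")  # hypothetical output file name
```

Using row positions rather than df.index for y keeps the lines evenly spaced even when the index has gaps, as it does in the sample data.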
Answer by Ben
You can use Bokeh (a Python library) to make a Gantt chart, and the result looks really beautiful. Here is some code I copied from Twitter: http://nbviewer.jupyter.org/gist/quebbs/10416d9fb954020688f2
from bokeh.plotting import figure, show, output_notebook, output_file
from bokeh.models import ColumnDataSource, Range1d
from bokeh.models.tools import HoverTool
import pandas as ps
# The unused bokeh.charts and datetime imports were dropped; bokeh.charts is no longer part of Bokeh
output_notebook()
#output_file('GanttChart.html') #use this to create a standalone html file to send to others
DF=ps.DataFrame(columns=['Item','Start','End','Color'])
Items=[
    ['Contract Review & Award','2015-7-22','2015-8-7','red'],
    ['Submit SOW','2015-8-10','2015-8-14','gray'],
    ['Initial Field Study','2015-8-17','2015-8-21','gray'],
    ['Topographic Processing','2015-9-1','2016-6-1','gray'],
    ['Init. Hydrodynamic Modeling','2016-1-2','2016-3-15','gray'],
    ['Prepare Suitability Curves','2016-2-1','2016-3-1','gray'],
    ['Improvement Conceptual Designs','2016-5-1','2016-6-1','gray'],
    ['Retrieve Water Level Data','2016-8-15','2016-9-15','gray'],
    ['Finalize Hydrodynamic Models','2016-9-15','2016-10-15','gray'],
    ['Determine Passability','2016-9-15','2016-10-1','gray'],
    ['Finalize Improvement Concepts','2016-10-1','2016-10-31','gray'],
    ['Stakeholder Meeting','2016-10-20','2016-10-21','blue'],
    ['Completion of Project','2016-11-1','2016-11-30','red']
    ] #first items on bottom
for i,Dat in enumerate(Items[::-1]):
    DF.loc[i]=Dat
#convert strings to datetime fields:
DF['Start_dt']=ps.to_datetime(DF.Start)
DF['End_dt']=ps.to_datetime(DF.End)
G=figure(title='Project Schedule',x_axis_type='datetime',width=800,height=400,y_range=DF.Item.tolist(),
        x_range=Range1d(DF.Start_dt.min(),DF.End_dt.max()), tools='save')
hover=HoverTool(tooltips="Task: @Item<br>\
Start: @Start<br>\
End: @End")
G.add_tools(hover)
DF['ID']=DF.index+0.8
DF['ID1']=DF.index+1.2
CDS=ColumnDataSource(DF)
G.quad(left='Start_dt', right='End_dt', bottom='ID', top='ID1',source=CDS,color="Color")
#G.rect(,"Item",source=CDS)
show(G)
Answer by Avi
It's possible to do this with horizontal bars too: broken_barh(xranges, yrange, **kwargs)
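As a rough sketch of how broken_barh could be applied to the question's data: xranges is a sequence of (xmin, xwidth) pairs and yrange is a (ymin, yheight) pair, so each station needs its start date and its duration rather than its end date. The three rows below are copied from the question's sample, the column names are the ones shown there, and the output file name is hypothetical.

```python
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd

# Three rows copied from the sample DataFrame in the question
df = pd.DataFrame({
    "StationId": ["UTAHDWQ-5932250", "UTAHDWQ-5932280", "UTAHDWQ-5932290"],
    "amin": pd.to_datetime(["1981-06-11 17:45:00", "1981-06-11 15:30:00",
                            "1992-06-10 15:45:00"]),
    "amax": pd.to_datetime(["1990-08-01 08:30:00", "2006-08-22 16:00:00",
                            "1998-06-30 11:33:00"]),
})

# Matplotlib date numbers are in days, so the difference is the duration in days
start = mdates.date2num(df["amin"])
width = mdates.date2num(df["amax"]) - start

fig, ax = plt.subplots()
for i, (x0, w) in enumerate(zip(start, width)):
    # One bar per station: it starts at x0, is w days wide,
    # and is 0.8 units tall, centred on row i
    ax.broken_barh([(x0, w)], (i - 0.4, 0.8))

ax.set_yticks(list(range(len(df))))
ax.set_yticklabels(df["StationId"])
ax.xaxis_date()  # interpret the numeric x values as dates
fig.tight_layout()
fig.savefig("stations_broken_barh.png")  # hypothetical output file name
```

Because xranges accepts several (xmin, xwidth) pairs per call, a station with more than one period of operation could be drawn on a single row by passing all of its periods in one list.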
Answer by MauricioRoman
While I do not know of any way to do this in Matplotlib, you may want to look at visualizing the data the way you want with D3, for example with this library:
https://github.com/jiahuang/d3-timeline
If you must do it with Matplotlib, here is one way in which it has been done:

