
Note: this page reproduces a popular StackOverflow question and its answer, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/43585988/

Time: 2020-09-14 03:27:48  Source: igfitidea

python pandas sum by hour of day

Tags: python, pandas, matplotlib, time-series

Asked by julia_3010

I'm working with the following dataset of hourly counts (df). The DataFrame has 8784 rows (one per hour for the year 2016).

dataframe (df)

I'd like to see if there are daily trends (e.g. an increase in the morning hours). For this I'd like to create a plot that has the hour of the day (from 0 to 24) on the x-axis and the number of cyclists on the y-axis (something like the plot shown at http://ofdataandscience.blogspot.co.uk/2013/03/capital-bikeshare-time-series-clustering.html).

I experimented with different ways of using pivot, resample and set_index and plotting the result with matplotlib, without success. In other words, I couldn't find a way to sum up every observation at a certain hour and then plot those sums for each weekday.

Any ideas how to do this? Thanks in advance!

Answered by jezrael

I think you can use groupby by hour and weekday, aggregate with sum (or maybe mean), then reshape with unstack and plot with DataFrame.plot:

# plot() returns a matplotlib Axes, so keep it in ax rather than overwriting df
ax = df.groupby([df['Date'].dt.hour, 'weekday'])['Cyclists'].sum().unstack().plot()
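Whether sum or mean is the better aggregate depends on how often each weekday occurs in the data: if one weekday has more observations than the others, its sums are inflated while its means are not. The following is a minimal sketch of that difference, assuming the column names Date, Cyclists and weekday from the question. The toy data (eight days of hourly counts, always 10 cyclists) is made up for illustration and is not the asker's dataset.

```python
import numpy as np
import pandas as pd

# Toy data (an assumption for illustration): 8 days of hourly counts starting
# on Monday 2016-01-04, so Monday occurs twice and every other weekday once.
rng = pd.date_range('2016-01-04', periods=24 * 8, freq='h')
df = pd.DataFrame({'Date': rng, 'Cyclists': np.full(len(rng), 10)})
df['weekday'] = df['Date'].dt.day_name()

# Group by hour of day and weekday name, then move weekdays into columns.
keys = [df['Date'].dt.hour, 'weekday']
hourly_sum = df.groupby(keys)['Cyclists'].sum().unstack()
hourly_mean = df.groupby(keys)['Cyclists'].mean().unstack()

print(hourly_sum.loc[0, ['Monday', 'Tuesday']])   # Monday is doubled
print(hourly_mean.loc[0, ['Monday', 'Tuesday']])  # both weekdays agree
```

Both tables have 24 rows (hours) and 7 columns (weekdays), so either can be passed to DataFrame.plot in the same way.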

Solution with pivot_table:

# plot() returns a matplotlib Axes, not a DataFrame, hence the name ax
ax = df.pivot_table(index=df['Date'].dt.hour,
                    columns='weekday',
                    values='Cyclists',
                    aggfunc='sum').plot()
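The groupby/unstack chain and pivot_table should produce the same hour-by-weekday table. Here is a small sketch that checks this on made-up data, assuming the column names from the question; the random counts and the seed are illustrative only.

```python
import numpy as np
import pandas as pd

# Illustrative data: one full week of hourly counts with random values.
np.random.seed(0)
rng = pd.date_range('2016-01-01', periods=24 * 7, freq='h')
df = pd.DataFrame({'Date': rng,
                   'Cyclists': np.random.randint(100, size=len(rng))})
df['weekday'] = df['Date'].dt.day_name()

# Variant 1: groupby on (hour, weekday), then unstack weekday into columns.
by_groupby = (df.groupby([df['Date'].dt.hour, 'weekday'])['Cyclists']
                .sum()
                .unstack())

# Variant 2: pivot_table with the same index, columns and aggregate.
by_pivot = df.pivot_table(index=df['Date'].dt.hour,
                          columns='weekday',
                          values='Cyclists',
                          aggfunc='sum')

# Compare labels and values of the two results.
same_columns = list(by_groupby.columns) == list(by_pivot.columns)
same_values = np.array_equal(by_groupby.values, by_pivot.values)
print(same_columns, same_values)
```

Which form to use is mostly a matter of taste: pivot_table names the roles (index, columns, values) explicitly, while the groupby chain makes each step visible.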

Sample:

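import numpy as np
import pandas as pd

N = 200
np.random.seed(100)
rng = pd.date_range('2016-01-01', periods=N, freq='h')  # hourly; older pandas spelled this 'H'
df = pd.DataFrame({'Date': rng, 'Cyclists': np.random.randint(100, size=N)})
# dt.weekday_name was removed in pandas 1.0; dt.day_name() replaces it
df['weekday'] = df['Date'].dt.day_name()
print(df.head())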
   Cyclists                Date weekday
0         8 2016-01-01 00:00:00  Friday
1        24 2016-01-01 01:00:00  Friday
2        67 2016-01-01 02:00:00  Friday
3        87 2016-01-01 03:00:00  Friday
4        79 2016-01-01 04:00:00  Friday


print (df.groupby([df['Date'].dt.hour, 'weekday'])['Cyclists'].sum().unstack())
weekday  Friday  Monday  Saturday  Sunday  Thursday  Tuesday  Wednesday
Date                                                                   
0           102      91       120      53        95       86         21
1           102      83       100      27        20       94         25
2           121      53       105      56        10       98         54
3           164      78        54      30         8       42          6
4           163       0        43      48        89       84         37
5            49      13       150      47        72       95         58
6            24      57        32      39        30       76         39
7           127      76       128      38        12       33         94
8            72       3        59      44        18       58         51
9           138      70        67      18        93       42         30
10           77       3         7      64        92       22         66
11          159      84        49      56        44        0         24
12          156      79        47      34        57       55         55
13           42      10        65      53         0       98         17
14          116      87        61      74        73       19         45
15          106      60        14      17        54       53         89
16           22       3        55      72        92       68         45
17          154      48        71      13        66       62         35
18           60      52        80      30        16       50         16
19           79      43         2      17         5       68         12
20           11      36        94      53        51       35         86
21          180       5        19      68        90       23         82
22          103      71        98      50        34        9         67
23           92      38        63      91        67       48         92

df.groupby([df['Date'].dt.hour, 'weekday'])['Cyclists'].sum().unstack().plot()

graph

EDIT:

You can also convert weekday to categorical so that the columns are sorted in weekday order rather than alphabetically:

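names = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']
# astype('category', categories=...) is no longer accepted by current pandas;
# pd.Categorical takes the same categories and ordered arguments
df['weekday'] = pd.Categorical(df['weekday'], categories=names, ordered=True)
df.groupby([df['Date'].dt.hour, 'weekday'])['Cyclists'].sum().unstack().plot()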
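To see what the categorical conversion changes, the sketch below compares the column order before and after it. It assumes the column names from the question and uses made-up constant counts; pd.CategoricalDtype is used here as one way to attach the ordered categories, and pd.Categorical would serve equally well.

```python
import numpy as np
import pandas as pd

names = ['Monday', 'Tuesday', 'Wednesday', 'Thursday',
         'Friday', 'Saturday', 'Sunday']

# Illustrative data: one full week of hourly counts, one cyclist per hour.
rng = pd.date_range('2016-01-01', periods=24 * 7, freq='h')
df = pd.DataFrame({'Date': rng, 'Cyclists': np.ones(len(rng), dtype=int)})
df['weekday'] = df['Date'].dt.day_name()

# With plain strings, unstack sorts the weekday columns alphabetically.
alphabetical = (df.groupby([df['Date'].dt.hour, 'weekday'])['Cyclists']
                  .sum()
                  .unstack())

# With an ordered categorical, the columns follow the order given in names.
weekday_type = pd.CategoricalDtype(categories=names, ordered=True)
df['weekday'] = df['weekday'].astype(weekday_type)
# observed=False is passed explicitly because the default for categorical
# groupers differs between pandas versions.
ordered = (df.groupby([df['Date'].dt.hour, 'weekday'], observed=False)['Cyclists']
             .sum()
             .unstack())

print(list(alphabetical.columns))
print(list(ordered.columns))
```

Because DataFrame.plot draws one line per column in column order, the legend of the resulting plot then lists the weekdays from Monday to Sunday.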

graph1
