pandas: Sum by hour of the day (Python)

Question

Asked by julia_3010

I'm working with the following dataset of hourly counts (df). The DataFrame has 8784 rows (hourly, for the year 2016).

I'd like to see whether there are daily trends (e.g. whether there is an increase in the morning hours). For this I'd like to create a plot that has the hour of the day (from 0 to 24) on the x-axis and the number of cyclists on the y-axis (something like the picture from http://ofdataandscience.blogspot.co.uk/2013/03/capital-bikeshare-time-series-clustering.html).

I experimented with different ways of using pivot, resample and set_index and plotting the result with matplotlib, without success. In other words, I couldn't find a way to sum up every observation at a certain hour and then plot those sums for each weekday.

Any ideas how to do this? Thanks in advance!

Answer 1

Answered by jezrael

I think you can use groupby by hour and weekday and aggregate with sum (or maybe mean), then reshape with unstack and plot with DataFrame.plot:

# Sum per (hour, weekday), move weekdays into columns, plot one line per weekday.
df.groupby([df['Date'].dt.hour, 'weekday'])['Cyclists'].sum().unstack().plot()

Solution with pivot_table:

# plot() returns a matplotlib Axes, so the result is not assigned to a DataFrame name.
df.pivot_table(index=df['Date'].dt.hour,
               columns='weekday',
               values='Cyclists',
               aggfunc='sum').plot()
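
As a rough check that the two approaches above produce the same table, the sketch below builds both on made-up data and compares them. The two-week date range and the random counts are assumptions for illustration only, not the asker's data.

```python
import numpy as np
import pandas as pd

# Hypothetical sample: two full weeks of hourly counts with made-up values.
rng = pd.date_range('2016-01-04', periods=24 * 14, freq=pd.Timedelta(hours=1))
counts = np.random.default_rng(0).integers(0, 100, size=len(rng))
df = pd.DataFrame({'Date': rng, 'Cyclists': counts})
df['weekday'] = df['Date'].dt.day_name()

hour = df['Date'].dt.hour

# Approach 1: groupby + sum + unstack.
by_groupby = df.groupby([hour, 'weekday'])['Cyclists'].sum().unstack()

# Approach 2: pivot_table with the same keys and aggregation.
by_pivot = df.pivot_table(index=hour, columns='weekday',
                          values='Cyclists', aggfunc='sum')

# Compare labels and values of the two hour-by-weekday tables.
print(list(by_groupby.columns) == list(by_pivot.columns))
print((by_groupby.to_numpy() == by_pivot.to_numpy()).all())
```

Either table has one row per hour of the day and one column per weekday, so calling .plot() on it draws one line per weekday.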

Sample:

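import numpy as np
import pandas as pd

N = 200
np.random.seed(100)
# Hourly timestamps; use freq='H' on pandas older than 2.2.
rng = pd.date_range('2016-01-01', periods=N, freq='h')
df = pd.DataFrame({'Cyclists': np.random.randint(100, size=N), 'Date': rng})
# dt.weekday_name was removed in pandas 1.0; dt.day_name() replaces it.
df['weekday'] = df['Date'].dt.day_name()
print(df.head())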
   Cyclists                Date weekday
0         8 2016-01-01 00:00:00  Friday
1        24 2016-01-01 01:00:00  Friday
2        67 2016-01-01 02:00:00  Friday
3        87 2016-01-01 03:00:00  Friday
4        79 2016-01-01 04:00:00  Friday

print (df.groupby([df['Date'].dt.hour, 'weekday'])['Cyclists'].sum().unstack())
weekday  Friday  Monday  Saturday  Sunday  Thursday  Tuesday  Wednesday
Date                                                                   
0           102      91       120      53        95       86         21
1           102      83       100      27        20       94         25
2           121      53       105      56        10       98         54
3           164      78        54      30         8       42          6
4           163       0        43      48        89       84         37
5            49      13       150      47        72       95         58
6            24      57        32      39        30       76         39
7           127      76       128      38        12       33         94
8            72       3        59      44        18       58         51
9           138      70        67      18        93       42         30
10           77       3         7      64        92       22         66
11          159      84        49      56        44        0         24
12          156      79        47      34        57       55         55
13           42      10        65      53         0       98         17
14          116      87        61      74        73       19         45
15          106      60        14      17        54       53         89
16           22       3        55      72        92       68         45
17          154      48        71      13        66       62         35
18           60      52        80      30        16       50         16
19           79      43         2      17         5       68         12
20           11      36        94      53        51       35         86
21          180       5        19      68        90       23         82
22          103      71        98      50        34        9         67
23           92      38        63      91        67       48         92

df.groupby([df['Date'].dt.hour, 'weekday'])['Cyclists'].sum().unstack().plot()
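
The answer mentions mean as a possible alternative to sum. A minimal sketch of why that can matter, assuming the same 200-hour range as the sample (the counts here are made up): some weekdays are observed more often than others, so their sums are inflated while their means are not.

```python
import numpy as np
import pandas as pd

# Same date range as the sample; the counts themselves are made up.
N = 200
rng = pd.date_range('2016-01-01', periods=N, freq=pd.Timedelta(hours=1))
values = np.random.default_rng(100).integers(0, 100, size=N)
df = pd.DataFrame({'Date': rng, 'Cyclists': values})
df['weekday'] = df['Date'].dt.day_name()

grouped = df.groupby([df['Date'].dt.hour, 'weekday'])['Cyclists']

# Number of observations behind each (hour, weekday) cell.
n_obs = grouped.count().unstack()

# Average count per cell, comparable across weekdays observed unequally often.
hourly_mean = grouped.mean().unstack()

print(n_obs.loc[0, ['Friday', 'Monday']])
print(hourly_mean.head())
```

With a full year of data every weekday appears roughly the same number of times, so sum and mean give similarly shaped curves; hourly_mean.plot() then shows the average count per hour for each weekday.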

EDIT:

You can also convert weekday to an ordered categorical so that the columns are sorted in weekday order rather than alphabetically:

names = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']
# astype('category', categories=...) is no longer supported; pass a CategoricalDtype.
df['weekday'] = df['weekday'].astype(pd.CategoricalDtype(categories=names, ordered=True))
df.groupby([df['Date'].dt.hour, 'weekday'])['Cyclists'].sum().unstack().plot()
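
To see the effect of the ordered categorical on column order, here is a small self-contained sketch. It is a variant that builds the column with pd.Categorical directly, and the one-week range and counts are assumptions made for illustration.

```python
import pandas as pd

names = ['Monday', 'Tuesday', 'Wednesday', 'Thursday',
         'Friday', 'Saturday', 'Sunday']

# Hypothetical sample: one week of hourly data with made-up counts.
rng = pd.date_range('2016-01-01', periods=24 * 7, freq=pd.Timedelta(hours=1))
df = pd.DataFrame({'Date': rng, 'Cyclists': range(len(rng))})

# Plain strings sort alphabetically; an ordered categorical keeps calendar order.
df['weekday'] = pd.Categorical(df['Date'].dt.day_name(),
                               categories=names, ordered=True)

# observed=False keeps every category as a column, even one with no rows.
table = (df.groupby([df['Date'].dt.hour, 'weekday'], observed=False)['Cyclists']
           .sum()
           .unstack())

print(list(table.columns))
```

The plot legend follows the column order, so the lines are then listed from Monday to Sunday.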
