Python: How to iterate over a pandas MultiIndex DataFrame using its index

Question

Asked by Yantraguru

I have a DataFrame df that looks like this. date and Time are the two levels of a MultiIndex.

                           observation1   observation2
date          Time                             
2012-11-02    9:15:00      79.373668      224
              9:16:00      130.841316     477
2012-11-03    9:15:00      45.312814      835
              9:16:00      123.776946     623
              9:17:00      153.76646      624
              9:18:00      463.276946     626
              9:19:00      663.176934     622
              9:20:00      763.77333      621
2012-11-04    9:15:00      115.449437     122
              9:16:00      123.776946     555
              9:17:00      153.76646      344
              9:18:00      463.276946     212

I want to run some complex processing on each daily block of data.

Pseudocode would look like this:

 for count in df(level 0 index) :
     new_df = get only chunk for count
     complex_process(new_df)

So, first of all, I could not find a way to access only the block for a single date, for example

2012-11-03    9:15:00      45.312814      835
              9:16:00      123.776946     623
              9:17:00      153.76646      624
              9:18:00      463.276946     626
              9:19:00      663.176934     622
              9:20:00      763.77333      621

and then send it for processing. I am doing this in a for loop because I am not sure whether there is a way to do it without specifying the exact value of the level 0 index. I did some basic searching and found df.index.get_level_values(0), but it returns every value (one per row), which causes the loop to run multiple times for each day. I want to create one DataFrame per day and send it for processing.

Answer 1

Accepted answer by chrisb

One easy way is to group by the first level of the index. Iterating over the groupby object returns each group key together with a sub-frame containing that group.

In [136]: for date, new_df in df.groupby(level=0):
     ...:     print(new_df)
     ...:     
                    observation1  observation2
date       Time                               
2012-11-02 9:15:00     79.373668           224
           9:16:00    130.841316           477

                    observation1  observation2
date       Time                               
2012-11-03 9:15:00     45.312814           835
           9:16:00    123.776946           623
           9:17:00    153.766460           624
           9:18:00    463.276946           626
           9:19:00    663.176934           622
           9:20:00    763.773330           621

                    observation1  observation2
date       Time                               
2012-11-04 9:15:00    115.449437           122
           9:16:00    123.776946           555
           9:17:00    153.766460           344
           9:18:00    463.276946           212
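The groupby loop above can be tried end to end with the sketch below. It rebuilds a shortened copy of the question's frame (string index labels are an assumption, since the original dtypes are not shown), and complex_process here is a hypothetical stand-in for whatever per-day processing is actually needed.

```python
import pandas as pd

# Shortened copy of the frame from the question; labels are plain strings (assumption).
index = pd.MultiIndex.from_tuples(
    [
        ("2012-11-02", "9:15:00"),
        ("2012-11-02", "9:16:00"),
        ("2012-11-03", "9:15:00"),
        ("2012-11-03", "9:16:00"),
        ("2012-11-04", "9:15:00"),
    ],
    names=["date", "Time"],
)
df = pd.DataFrame(
    {
        "observation1": [79.373668, 130.841316, 45.312814, 123.776946, 115.449437],
        "observation2": [224, 477, 835, 623, 122],
    },
    index=index,
)


def complex_process(day_df):
    """Hypothetical placeholder: summarise one day's block."""
    return {
        "rows": len(day_df),
        "obs2_total": int(day_df["observation2"].sum()),
    }


# groupby(level=0) yields (date, sub-frame) pairs, one pair per distinct date.
results = {}
for date, new_df in df.groupby(level=0):
    results[date] = complex_process(new_df)

print(results)
```

Each sub-frame keeps both index levels, so the processing function still sees the date alongside the time. If the function returns something that pandas can combine, df.groupby(level=0).apply(...) may be an alternative to the explicit loop.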

Answer 2

Answered by psorenson

What about this?

for idate in df.index.get_level_values('date'):
    # .loc replaces .ix (removed in pandas 1.0); dates repeat once per row here
    complex_process(df.loc[idate], idate)

Answer 3

Answered by melbay

Building on @psorenson's answer, we can get the unique values of the level and the corresponding DataFrame slices without numpy, as follows:

for date in df.index.get_level_values('date').unique():
    print(df.loc[date])
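As a rough illustration of the difference deduplication makes, the sketch below uses a shortened copy of the question's data (string labels are an assumption). It also shows df.xs with drop_level=False as one possible way to keep the date level on each slice, since df.loc[date] drops it.

```python
import pandas as pd

# Shortened copy of the question's frame; labels are plain strings (assumption).
index = pd.MultiIndex.from_tuples(
    [
        ("2012-11-02", "9:15:00"),
        ("2012-11-02", "9:16:00"),
        ("2012-11-03", "9:15:00"),
        ("2012-11-03", "9:16:00"),
        ("2012-11-04", "9:15:00"),
    ],
    names=["date", "Time"],
)
df = pd.DataFrame({"observation2": [224, 477, 835, 623, 122]}, index=index)

# get_level_values returns one label per row, so dates repeat.
dates = df.index.get_level_values("date")
unique_dates = dates.unique()  # one label per distinct date, in order of appearance

# df.loc[date] selects the day's rows and drops the 'date' level.
slices = {date: df.loc[date] for date in unique_dates}

# df.xs(..., drop_level=False) selects the same rows but keeps both levels.
full_slices = {
    date: df.xs(date, level="date", drop_level=False) for date in unique_dates
}

for date in unique_dates:
    print(date)
    print(slices[date])
```

Which form is preferable depends on whether the downstream processing needs the date inside the frame or is happy to receive it as a separate argument.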
