Pandas DataFrame mask based on index

Question

Asked by BrandonAGr

I have the following dataframe:

import pandas as pd
index = pd.date_range('2013-1-1',periods=10,freq='15Min')
data = pd.DataFrame(data=[1,2,3,4,5,6,7,8,9,0], columns=['value'], index=index)

How can I generate a mask based on the index value? I know I can do something like:

data['value'] > 3
Out[40]: 
2013-01-01 00:00:00    False
2013-01-01 00:15:00    False
2013-01-01 00:30:00    False
2013-01-01 00:45:00     True
2013-01-01 01:00:00     True
2013-01-01 01:15:00     True
2013-01-01 01:30:00     True
2013-01-01 01:45:00     True
2013-01-01 02:00:00     True
2013-01-01 02:15:00    False
Freq: 15T, Name: value, dtype: bool

I want to generate a mask that only considers rows whose index falls in a certain range. I was thinking of something like data['index'].time() > datetime.time(1,15) to generate the mask, except that data['index'] fails because index is not the name of a column. How can you reference the index value of a row in a mask?

Answer 1

Answered by Andy Hayden

You can mask using indexer_between_time:

In [11]: data.index.indexer_between_time(start='01:15', end='02:00')
Out[11]: array([5, 6, 7, 8])

In [12]: data.iloc[data.index.indexer_between_time(start='1:15', end='02:00')]
Out[12]:
                     value
2013-01-01 01:15:00      6
2013-01-01 01:30:00      7
2013-01-01 01:45:00      8
2013-01-01 02:00:00      9

As you can see, you access the index by the attribute .index.
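
To connect this back to the boolean mask the question asks for, here is a minimal sketch that builds the mask directly from .index. It assumes the same example frame as in the question; the names time_mask, combined and selected are illustrative and do not come from the original posts.

```python
import datetime

import pandas as pd

# Same example frame as in the question.
index = pd.date_range('2013-01-01', periods=10, freq='15min')
data = pd.DataFrame({'value': [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]}, index=index)

# DatetimeIndex.time holds one datetime.time object per row, so it can be
# compared against a datetime.time to get a boolean array.
time_mask = data.index.time > datetime.time(1, 15)

# The index-based mask can be combined with a value-based mask using &.
combined = time_mask & (data['value'] > 3).to_numpy()
selected = data[combined]
print(selected)
```

Unlike indexer_between_time, this produces a boolean array rather than integer positions, so it can be passed straight to data[...] or combined with other conditions.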

Note: by default, indexer_between_time has both include_start and include_end set to True. It also offers a tz argument to compare the time against a different timezone.
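
As a rough illustration of the inclusive defaults, the sketch below compares the default call with one that excludes the end time. It assumes a pandas version in which indexer_between_time accepts the include_end keyword named in the note; the variable names closed and half_open are illustrative.

```python
import pandas as pd

# Same example frame as in the question.
index = pd.date_range('2013-01-01', periods=10, freq='15min')
data = pd.DataFrame({'value': [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]}, index=index)

# Default: both endpoints are included, so the 02:00 row is selected.
closed = data.index.indexer_between_time('01:15', '02:00')

# Excluding the end time drops the 02:00 row.
half_open = data.index.indexer_between_time('01:15', '02:00', include_end=False)

print(closed)
print(half_open)
```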

Answer 2

Answered by John Saraceno

The 'start' and 'stop' keywords are deprecated. With pandas >17.1, I had to use the following syntax instead:

data.iloc[data.index.indexer_between_time('1:15', '02:00')]
Out[90]: 
                     value
2013-01-01 01:15:00      6
2013-01-01 01:30:00      7
2013-01-01 01:45:00      8
2013-01-01 02:00:00      9
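
Since indexer_between_time returns integer positions rather than a boolean mask, one possible way to get a mask is to mark those positions in a boolean array. The sketch below uses the positional syntax from this answer; the names positions, mask, in_window and same_rows are illustrative. DataFrame.between_time is shown as a shortcut for the case where only the row selection is needed.

```python
import numpy as np
import pandas as pd

# Same example frame as in the question.
index = pd.date_range('2013-01-01', periods=10, freq='15min')
data = pd.DataFrame({'value': [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]}, index=index)

# Integer positions of the rows between 01:15 and 02:00 (inclusive).
positions = data.index.indexer_between_time('01:15', '02:00')

# Turn the positions into a boolean mask with one entry per row.
mask = np.zeros(len(data), dtype=bool)
mask[positions] = True

# The mask can now be combined with a value-based condition.
in_window = data[mask & (data['value'] > 6).to_numpy()]

# When only the selection is needed, between_time returns the rows directly.
same_rows = data.between_time('01:15', '02:00')
print(in_window)
```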
