pandas: get the number of rows before and after a certain index value

Notice: this page is a Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/29819671/

Time: 2020-09-13 23:15:10  Source: igfitidea

Get number of rows before and after a certain index value in pandas

python pandas

Asked by jkokorian

Let's say I have the following:

In [1]: import pandas as pd
        import numpy as np
        df = pd.DataFrame(data=np.random.rand(11),
                          index=pd.date_range('2015-04-20', '2015-04-30'),
                          columns=['A'])
        df
Out[1]: 
               A
2015-04-20  0.694983
2015-04-21  0.393851
2015-04-22  0.690138
2015-04-23  0.674222
2015-04-24  0.763175
2015-04-25  0.761917
2015-04-26  0.999274
2015-04-27  0.907871
2015-04-28  0.464818
2015-04-29  0.005733
2015-04-30  0.806351

I have some complicated method that identifies a single index as being interesting, for example '2015-04-25'. I can retrieve the row with that index using:

In [2]: df.loc['2015-04-25']
Out[2]: 
A    0.761917
Name: 2015-04-25 00:00:00, dtype: float64

What would be the nicest way to obtain n rows before and/or after that index value?

What I would like to do is something like:

In[3]: df.getRowsBeforeLoc('2015-04-25',3)
Out[3]:
2015-04-22  0.690138
2015-04-23  0.674222
2015-04-24  0.763175
2015-04-25  0.761917

Or, analogously, for the rows after it:

In[3]: df.getRowsAfterLoc('2015-04-25',3)
Out[3]:
2015-04-25  0.761917
2015-04-26  0.999274
2015-04-27  0.907871
2015-04-28  0.464818

(I don't have a strong opinion on whether or not the row that corresponds to the target index value itself is included.)

Answered by EdChum

loc supports slicing, and both the begin and end points are included in the range:

In [363]:

df.loc[:'2015-04-25']
Out[363]:
                   A
2015-04-20  0.827699
2015-04-21  0.901140
2015-04-22  0.427304
2015-04-23  0.002189
2015-04-24  0.041965
2015-04-25  0.141787

In [364]:

df.loc['2015-04-25':]
Out[364]:
                   A
2015-04-25  0.141787
2015-04-26  0.598237
2015-04-27  0.106461
2015-04-28  0.297159
2015-04-29  0.058392
2015-04-30  0.621325
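
As a small check of the inclusive behaviour described above, the sketch below rebuilds a frame shaped like the one in the question and compares a label slice with a positional slice. The fixed seed and the variable names (before, after, before_pos) are assumptions made for this illustration, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Frame shaped like the question's; the seed is an assumption for reproducibility.
rng = np.random.default_rng(0)
df = pd.DataFrame(
    data=rng.random(11),
    index=pd.date_range("2015-04-20", "2015-04-30"),
    columns=["A"],
)

# Label-based slices: the named end point is part of the result.
before = df.loc[:"2015-04-25"]   # 2015-04-20 .. 2015-04-25
after = df.loc["2015-04-25":]    # 2015-04-25 .. 2015-04-30

# Positional slice for comparison: the stop position is excluded.
pos = df.index.get_loc(pd.Timestamp("2015-04-25"))
before_pos = df.iloc[:pos]       # 2015-04-20 .. 2015-04-24

print(len(before), len(after), len(before_pos))
```

The label slices both contain the row for 2015-04-25, while the positional slice stops one row short of it.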

To get the first or last n rows of such a slice, use head or tail:

In [378]:

df.loc[:'2015-04-25'].head(3)
Out[378]:
                   A
2015-04-20  0.827699
2015-04-21  0.901140
2015-04-22  0.427304

In [377]:

df.loc[:'2015-04-25'].tail(3)
Out[377]:
                   A
2015-04-23  0.002189
2015-04-24  0.041965
2015-04-25  0.141787
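
Combining the slice with head/tail gets close to the getRowsBeforeLoc / getRowsAfterLoc calls sketched in the question. The helpers below are one possible wrapping; rows_before and rows_after are hypothetical names, not pandas methods, and the sketch assumes a sorted index and that the target row itself should be included.

```python
import numpy as np
import pandas as pd


def rows_before(frame, label, n):
    """Return the row at `label` plus up to `n` rows preceding it."""
    # The loc slice ends at `label` (inclusive), so keep the last n + 1 rows.
    return frame.loc[:label].tail(n + 1)


def rows_after(frame, label, n):
    """Return the row at `label` plus up to `n` rows following it."""
    # The loc slice starts at `label` (inclusive), so keep the first n + 1 rows.
    return frame.loc[label:].head(n + 1)


# Deterministic stand-in for the question's random data (an assumption).
df = pd.DataFrame(
    data=np.arange(11, dtype=float),
    index=pd.date_range("2015-04-20", "2015-04-30"),
    columns=["A"],
)

print(rows_before(df, "2015-04-25", 3))
print(rows_after(df, "2015-04-25", 3))
```

To exclude the target row instead, one option would be to drop the last row of rows_before (or the first row of rows_after) with iloc.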

Update

To get the row before/after a specific value, we can use get_loc on the index to return an integer position and then use this with iloc to get the previous/next row:

In [388]:

df.index.get_loc('2015-04-25')
Out[388]:
5

In [391]:

df.iloc[df.index.get_loc('2015-04-25')-1]
Out[391]:
A    0.041965
Name: 2015-04-24 00:00:00, dtype: float64

In [392]:

df.iloc[df.index.get_loc('2015-04-25')+1]
Out[392]:
A    0.598237
Name: 2015-04-26 00:00:00, dtype: float64
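
The same get_loc/iloc idea extends from a single neighbouring row to a window of n rows on either side. The function below is a minimal sketch under a few assumptions: rows_around is a hypothetical name, the index is unique (so get_loc returns a plain integer), and the target row is included. The start position is clamped at 0 because a negative position passed to iloc would count from the end of the frame rather than stop at the first row.

```python
import numpy as np
import pandas as pd


def rows_around(frame, label, before=0, after=0):
    """Return the row at `label` with up to `before`/`after` neighbouring rows."""
    pos = frame.index.get_loc(label)
    # For a non-unique index get_loc may return a slice or boolean mask instead.
    if not isinstance(pos, (int, np.integer)):
        raise ValueError("label must match exactly one row")
    start = max(pos - before, 0)   # clamp so we never wrap around to the end
    stop = pos + after + 1         # iloc truncates a stop past the last row
    return frame.iloc[start:stop]


# Deterministic stand-in for the question's random data (an assumption).
df = pd.DataFrame(
    data=np.arange(11, dtype=float),
    index=pd.date_range("2015-04-20", "2015-04-30"),
    columns=["A"],
)

target = pd.Timestamp("2015-04-25")
print(rows_around(df, target, before=3))            # 2015-04-22 .. 2015-04-25
print(rows_around(df, target, before=3, after=3))   # 2015-04-22 .. 2015-04-28
```

Unlike the loc-based slicing, this works purely on positions, so it does not rely on the index being sorted.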