How to get the shifted index value of a DataFrame in pandas?

Source: Stack Overflow, http://stackoverflow.com/questions/37820130/
License: CC BY-SA 4.0. You are free to use and share this content, but you must attribute it to the original authors on Stack Overflow.

Tags: python, pandas, dataframe, date-range, shift

Asked by ??????

Consider the simple example below:

import pandas as pd

# Hourly DatetimeIndex with five timestamps
date = pd.date_range('1/1/2011', periods=5, freq='H')

df = pd.DataFrame({'cat': ['A', 'A', 'A', 'B', 'B']}, index=date)
df
Out[278]: 
                    cat
2011-01-01 00:00:00   A
2011-01-01 01:00:00   A
2011-01-01 02:00:00   A
2011-01-01 03:00:00   B
2011-01-01 04:00:00   B

I want to create a column that contains the lagged/lead value of the index, something like:

df['index_shifted']=df.index.shift(1)

So, for instance, at time 2011-01-01 01:00:00 I expect the variable index_shifted to be 2011-01-01 00:00:00.

How can I do that? Thanks!

Answered by jezrael

I think you need Index.shift with -1:

df['index_shifted']= df.index.shift(-1)
print (df)
                    cat       index_shifted
2011-01-01 00:00:00   A 2010-12-31 23:00:00
2011-01-01 01:00:00   A 2011-01-01 00:00:00
2011-01-01 02:00:00   A 2011-01-01 01:00:00
2011-01-01 03:00:00   B 2011-01-01 02:00:00
2011-01-01 04:00:00   B 2011-01-01 03:00:00

For me it works without freq, but it may be necessary with real data:

df['index_shifted']= df.index.shift(-1, freq='H')
print (df)
                    cat       index_shifted
2011-01-01 00:00:00   A 2010-12-31 23:00:00
2011-01-01 01:00:00   A 2011-01-01 00:00:00
2011-01-01 02:00:00   A 2011-01-01 01:00:00
2011-01-01 03:00:00   B 2011-01-01 02:00:00
2011-01-01 04:00:00   B 2011-01-01 03:00:00
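
As a quick check of the sign convention used above, here is a minimal sketch built on the question's data. It assumes the index carries an hourly frequency; `pd.offsets.Hour()` is used in place of the 'H' alias only to avoid depending on alias spelling across pandas versions, and the names `without_freq`, `with_freq` and `by_subtraction` are made up for illustration.

```python
import pandas as pd

# Hourly index as in the question; the offset object replaces the 'H' alias
# (an assumption of this sketch, not something the original answer does).
hour = pd.offsets.Hour()
date = pd.date_range('2011-01-01', periods=5, freq=hour)
df = pd.DataFrame({'cat': ['A', 'A', 'A', 'B', 'B']}, index=date)

# Negative periods move every timestamp back by one step of the index frequency.
without_freq = df.index.shift(-1)

# Passing freq explicitly gives the same result when the index already has that freq.
with_freq = df.index.shift(-1, freq=hour)

# Plain arithmetic equivalent: subtract one hour from each timestamp.
by_subtraction = df.index - pd.Timedelta(hours=1)

df['index_shifted'] = without_freq
```

All three expressions should produce the same timestamps here, so the row at 2011-01-01 01:00:00 ends up with 2011-01-01 00:00:00, which is what the question asks for.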

EDIT:

If the freq of the DatetimeIndex is None, you need to add freq to shift:

import pandas as pd

date = pd.date_range('1/1/2011', periods=5, freq='H').union(pd.date_range('5/1/2011', periods=5, freq='H'))


df = pd.DataFrame({'cat' : ['A', 'A', 'A', 'B',
                         'B','A', 'A', 'A', 'B',
                         'B']}, index = date)

print (df.index)
DatetimeIndex(['2011-01-01 00:00:00', '2011-01-01 01:00:00',
               '2011-01-01 02:00:00', '2011-01-01 03:00:00',
               '2011-01-01 04:00:00', '2011-05-01 00:00:00',
               '2011-05-01 01:00:00', '2011-05-01 02:00:00',
               '2011-05-01 03:00:00', '2011-05-01 04:00:00'],
              dtype='datetime64[ns]', freq=None)

df['index_shifted']= df.index.shift(-1, freq='H')
print (df)
                    cat       index_shifted
2011-01-01 00:00:00   A 2010-12-31 23:00:00
2011-01-01 01:00:00   A 2011-01-01 00:00:00
2011-01-01 02:00:00   A 2011-01-01 01:00:00
2011-01-01 03:00:00   B 2011-01-01 02:00:00
2011-01-01 04:00:00   B 2011-01-01 03:00:00
2011-05-01 00:00:00   A 2011-04-30 23:00:00
2011-05-01 01:00:00   A 2011-05-01 00:00:00
2011-05-01 02:00:00   A 2011-05-01 01:00:00
2011-05-01 03:00:00   B 2011-05-01 02:00:00
2011-05-01 04:00:00   B 2011-05-01 03:00:00
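
Note that with an irregular index, "one hour earlier" and "the index value of the previous row" are no longer the same thing. The sketch below contrasts the two on a shortened version of the data above. The positional variant using `to_series().shift(1)` is an alternative added here for comparison, not part of the original answer, and the column names `minus_one_hour` and `previous_row` are hypothetical.

```python
import pandas as pd

# Two hourly blocks separated by a gap, so the combined index has freq=None.
hour = pd.offsets.Hour()
date = pd.date_range('2011-01-01', periods=3, freq=hour).union(
    pd.date_range('2011-05-01', periods=3, freq=hour))
df = pd.DataFrame({'cat': ['A', 'A', 'B', 'A', 'A', 'B']}, index=date)

# Calendar shift: each timestamp minus one hour, regardless of neighbouring rows.
df['minus_one_hour'] = df.index.shift(-1, freq=hour)

# Positional lag: the index value of the previous row; the first row has none (NaT).
df['previous_row'] = df.index.to_series().shift(1)

print(df)
```

The two columns agree inside each hourly block and differ at the first row and at the gap, so which one is appropriate depends on whether the lag is meant in clock time or in rows.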

Answered by zw324

What's wrong with df['index_shifted']=df.index.shift(-1)?

(Genuine question, not sure if I missed something)