How to get the shifted index value of a DataFrame in pandas?
Note: this page reproduces a popular Stack Overflow question under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license and attribute it to the original authors (not me): Stack Overflow
Original source: http://stackoverflow.com/questions/37820130/
Asked by ??????
Consider the simple example below:
import pandas as pd
date = pd.date_range('1/1/2011', periods=5, freq='H')
df = pd.DataFrame({'cat': ['A', 'A', 'A', 'B', 'B']}, index=date)
df
Out[278]:
cat
2011-01-01 00:00:00 A
2011-01-01 01:00:00 A
2011-01-01 02:00:00 A
2011-01-01 03:00:00 B
2011-01-01 04:00:00 B
I want to create a variable that contains the lagged/lead value of the index. That is something like:
df['index_shifted']=df.index.shift(1)
So, for instance, at time 2011-01-01 01:00:00 I expect the variable index_shifted to be 2011-01-01 00:00:00.
How can I do that? Thanks!
Answered by jezrael
I think you need Index.shift with -1:
df['index_shifted']= df.index.shift(-1)
print (df)
cat index_shifted
2011-01-01 00:00:00 A 2010-12-31 23:00:00
2011-01-01 01:00:00 A 2011-01-01 00:00:00
2011-01-01 02:00:00 A 2011-01-01 01:00:00
2011-01-01 03:00:00 B 2011-01-01 02:00:00
2011-01-01 04:00:00 B 2011-01-01 03:00:00
For me it works without freq, but it may be necessary with real data:
df['index_shifted']= df.index.shift(-1, freq='H')
print (df)
cat index_shifted
2011-01-01 00:00:00 A 2010-12-31 23:00:00
2011-01-01 01:00:00 A 2011-01-01 00:00:00
2011-01-01 02:00:00 A 2011-01-01 01:00:00
2011-01-01 03:00:00 B 2011-01-01 02:00:00
2011-01-01 04:00:00 B 2011-01-01 03:00:00
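A minimal sketch of why the call works without freq in this example: an index built by pd.date_range carries its own frequency, so shift can fall back on it. The frequency is passed as a pd.Timedelta rather than the 'H' alias, and the names has_freq and same_as_subtraction are introduced here for illustration only; neither choice comes from the original answer.

```python
import pandas as pd

# Regular hourly index; pd.Timedelta is used for the frequency here
# (an assumption, chosen so the sketch does not depend on alias spelling).
hour = pd.Timedelta(hours=1)
date = pd.date_range('1/1/2011', periods=5, freq=hour)
df = pd.DataFrame({'cat': ['A', 'A', 'A', 'B', 'B']}, index=date)

# The index remembers its frequency, so shift() needs no freq argument.
has_freq = df.index.freq is not None

# Shifting by -1 period moves every timestamp back by one hour.
shifted = df.index.shift(-1)
same_as_subtraction = bool((shifted == df.index - hour).all())

df['index_shifted'] = shifted
print(has_freq, same_as_subtraction)
print(df)
```

Checking df.index.freq first is a quick way to tell whether the freq argument can be left out for a given dataset.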
EDIT:
If the freq of the DatetimeIndex is None, you need to add freq to shift:
import pandas as pd
date = pd.date_range('1/1/2011', periods=5, freq='H').union(pd.date_range('5/1/2011', periods=5, freq='H'))
df = pd.DataFrame({'cat' : ['A', 'A', 'A', 'B',
'B','A', 'A', 'A', 'B',
'B']}, index = date)
print (df.index)
DatetimeIndex(['2011-01-01 00:00:00', '2011-01-01 01:00:00',
'2011-01-01 02:00:00', '2011-01-01 03:00:00',
'2011-01-01 04:00:00', '2011-05-01 00:00:00',
'2011-05-01 01:00:00', '2011-05-01 02:00:00',
'2011-05-01 03:00:00', '2011-05-01 04:00:00'],
dtype='datetime64[ns]', freq=None)
df['index_shifted']= df.index.shift(-1, freq='H')
print (df)
cat index_shifted
2011-01-01 00:00:00 A 2010-12-31 23:00:00
2011-01-01 01:00:00 A 2011-01-01 00:00:00
2011-01-01 02:00:00 A 2011-01-01 01:00:00
2011-01-01 03:00:00 B 2011-01-01 02:00:00
2011-01-01 04:00:00 B 2011-01-01 03:00:00
2011-05-01 00:00:00 A 2011-04-30 23:00:00
2011-05-01 01:00:00 A 2011-05-01 00:00:00
2011-05-01 02:00:00 A 2011-05-01 01:00:00
2011-05-01 03:00:00 B 2011-05-01 02:00:00
2011-05-01 04:00:00 B 2011-05-01 03:00:00
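To make the failure mode concrete, here is a hedged sketch of what happens when freq is omitted on such an index. It assumes a reasonably recent pandas, where the call raises NullFrequencyError (a subclass of ValueError); the variable names hour and raised are illustrative, and pd.Timedelta again stands in for the 'H' alias.

```python
import pandas as pd

hour = pd.Timedelta(hours=1)
# Two hourly blocks separated by a gap; the union has no single frequency.
date = pd.date_range('1/1/2011', periods=5, freq=hour).union(
    pd.date_range('5/1/2011', periods=5, freq=hour))
df = pd.DataFrame({'cat': list('AAABBAAABB')}, index=date)

try:
    df.index.shift(-1)          # no freq on the index, none passed in
    raised = False
except ValueError:              # NullFrequencyError derives from ValueError
    raised = True

# Passing freq explicitly tells shift() how far to move each timestamp.
df['index_shifted'] = df.index.shift(-1, freq=hour)
print(raised)
print(df)
```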
Answered by zw324
What's wrong with df['index_shifted']=df.index.shift(-1)?
(Genuine question, not sure if I missed something)
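One case where the result of Index.shift may not be what is wanted, sketched here as an assumption about the asker's intent rather than something either answer states: Index.shift moves each timestamp by a time offset, so on an index with gaps it does not return the label of the previous row. If the previous row's label is the goal, shifting the index as a Series is one possible alternative. The column names minus_one_hour and previous_label are made up for this illustration.

```python
import pandas as pd

hour = pd.Timedelta(hours=1)
# Irregular index: three hourly rows in January, two in May.
date = pd.date_range('1/1/2011', periods=3, freq=hour).union(
    pd.date_range('5/1/2011', periods=2, freq=hour))
df = pd.DataFrame({'cat': list('AAABB')}, index=date)

# Time-based: every label moved back one hour, regardless of neighbouring rows.
df['minus_one_hour'] = df.index.shift(-1, freq=hour)

# Position-based: the label of the previous row; the first row has none (NaT).
df['previous_label'] = df.index.to_series().shift(1)
print(df)
```

On a regular index the two columns agree except for the first row, where the positional version has no previous row to draw from.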