Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original Stack Overflow URL: http://stackoverflow.com/questions/27032052/

Time: 2020-08-19 01:21:29  Source: igfitidea

How do I properly set the DatetimeIndex for a Pandas datetime object in a dataframe?

Tags: python, datetime, pandas

Asked by user3654387

I have a pandas dataframe:

         lat         lng  alt          days        date      time
0  40.003834  116.321462  211  39745.175405  2008-10-24  04:12:35
1  40.003783  116.321431  201  39745.175463  2008-10-24  04:12:40
2  40.003690  116.321429  203  39745.175521  2008-10-24  04:12:45
3  40.003589  116.321427  194  39745.175579  2008-10-24  04:12:50
4  40.003522  116.321412  190  39745.175637  2008-10-24  04:12:55
5  40.003509  116.321484  188  39745.175694  2008-10-24  04:13:00

I am trying to combine the df['date'] and df['time'] columns into a single datetime column. I can do:

df['Datetime'] = pd.to_datetime(df['date']+df['time'])
df = df.set_index(['Datetime'])
del df['date']
del df['time']

And I get:

                          lat         lng  alt          days
Datetime
2008-10-2404:12:35  40.003834  116.321462  211  39745.175405
2008-10-2404:12:40  40.003783  116.321431  201  39745.175463
2008-10-2404:12:45  40.003690  116.321429  203  39745.175521
2008-10-2404:12:50  40.003589  116.321427  194  39745.175579
2008-10-2404:12:55  40.003522  116.321412  190  39745.175637

But then if I try:

from datetime import time
df.between_time(time(1), time(22, 59, 59))['lng'].std()

I get an error - 'TypeError: Index must be DatetimeIndex'

So, I've also tried setting the DatetimeIndex:

df['Datetime'] = pd.to_datetime(df['date']+df['time'])
#df = df.set_index(['Datetime'])
df = df.set_index(pd.DatetimeIndex(df['Datetime']))
del df['date']
del df['time']

This also throws an error: 'DateParseError: unknown string format'

How do I create the datetime column and the DatetimeIndex correctly so that df.between_time() works?

Accepted answer by Kracit

To simplify Kirubaharan's answer a bit:

df['Datetime'] = pd.to_datetime(df['date'] + ' ' + df['time'])
df = df.set_index('Datetime')

And to get rid of the unwanted columns (as the OP did, although the question did not explicitly ask for it):

df = df.drop(['date','time'], axis=1)
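
For anyone who wants to try this end to end, here is a minimal self-contained sketch. It assumes the date and time columns hold strings, as they appear to in the question; the sample frame is rebuilt by hand from the first three rows, and the name `subset` is only illustrative.

```python
from datetime import time

import pandas as pd

# Hand-built sample mirroring the first rows of the question's data
# (assumption: both columns are plain strings)
df = pd.DataFrame({
    "lat": [40.003834, 40.003783, 40.003690],
    "lng": [116.321462, 116.321431, 116.321429],
    "date": ["2008-10-24", "2008-10-24", "2008-10-24"],
    "time": ["04:12:35", "04:12:40", "04:12:45"],
})

# Join date and time with a space so the combined string is parseable
df["Datetime"] = pd.to_datetime(df["date"] + " " + df["time"])
df = df.set_index("Datetime").drop(["date", "time"], axis=1)

# The index should now be a DatetimeIndex, so between_time() is available
subset = df.between_time(time(1), time(22, 59, 59))
print(type(df.index).__name__)
print(subset["lng"].std())
```

The only change from the question's code that matters is the `' '` between the two columns; without it the concatenated string is not in a recognisable datetime format.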

Answer by Kirubaharan J

You are not creating the datetime index properly. Join the date and time strings with a space and parse them with an explicit format:

# Named fmt to avoid shadowing the built-in format()
fmt = '%Y-%m-%d %H:%M:%S'
df['Datetime'] = pd.to_datetime(df['date'] + ' ' + df['time'], format=fmt)
df = df.set_index(pd.DatetimeIndex(df['Datetime']))
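
To see why the separator matters, here is a small sketch comparing the two concatenations under an explicit format. It is an illustration rather than part of the original answer: the `dates`, `times`, `bad` and `good` names are made up, and `errors="coerce"` is used only so that unparseable values show up as NaT instead of raising.

```python
import pandas as pd

fmt = "%Y-%m-%d %H:%M:%S"
dates = pd.Series(["2008-10-24", "2008-10-24"])
times = pd.Series(["04:12:35", "04:12:40"])

# No separator: the strings look like "2008-10-2404:12:35" and do not match fmt;
# errors="coerce" converts such values to NaT instead of raising
bad = pd.to_datetime(dates + times, format=fmt, errors="coerce")

# With a space the strings match fmt and parse cleanly
good = pd.to_datetime(dates + " " + times, format=fmt)

print(bad.isna().all())
print(good.dt.hour.tolist())
```

Checking `isinstance(df.index, pd.DatetimeIndex)` after `set_index` is a quick way to confirm the index type before calling `between_time()`.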