Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/17978092/

Date: 2020-08-19 09:39:41  Source: igfitidea

Combine Date and Time columns using python pandas

Tags: python, pandas, datetime, series

Asked by richie

I have a pandas dataframe with the following columns:

Date              Time
01-06-2013      23:00:00
02-06-2013      01:00:00
02-06-2013      21:00:00
02-06-2013      22:00:00
02-06-2013      23:00:00
03-06-2013      01:00:00
03-06-2013      21:00:00
03-06-2013      22:00:00
03-06-2013      23:00:00
04-06-2013      01:00:00

How do I combine data['Date'] & data['Time'] to get the following? Is there a way of doing it using pd.to_datetime?

Date
01-06-2013 23:00:00
02-06-2013 01:00:00
02-06-2013 21:00:00
02-06-2013 22:00:00
02-06-2013 23:00:00
03-06-2013 01:00:00
03-06-2013 21:00:00
03-06-2013 22:00:00
03-06-2013 23:00:00
04-06-2013 01:00:00

Accepted answer by Andy Hayden

It's worth mentioning that you may have been able to read this in directly, e.g. if you were using read_csv, by passing parse_dates=[['Date', 'Time']].
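
As a rough sketch of the read_csv route: the nested-list form of parse_dates is deprecated as of pandas 2.2, so this example reads both columns as strings and combines them afterwards instead. The inline CSV text and the DateTime column name are made up for illustration, and the day-month-year format is an assumption about the data.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for a real file.
csv_text = "Date,Time\n01-06-2013,23:00:00\n02-06-2013,01:00:00\n"

# Read both columns as plain strings so nothing is parsed implicitly.
data = pd.read_csv(io.StringIO(csv_text), dtype=str)

# Join with a space and parse once; the format assumes day-month-year input.
data["DateTime"] = pd.to_datetime(
    data["Date"] + " " + data["Time"], format="%d-%m-%Y %H:%M:%S"
)
```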

Assuming these are just strings you could simply add them together (with a space), allowing you to apply to_datetime:

In [11]: df['Date'] + ' ' + df['Time']
Out[11]:
0    01-06-2013 23:00:00
1    02-06-2013 01:00:00
2    02-06-2013 21:00:00
3    02-06-2013 22:00:00
4    02-06-2013 23:00:00
5    03-06-2013 01:00:00
6    03-06-2013 21:00:00
7    03-06-2013 22:00:00
8    03-06-2013 23:00:00
9    04-06-2013 01:00:00
dtype: object

In [12]: pd.to_datetime(df['Date'] + ' ' + df['Time'])
Out[12]:
0   2013-01-06 23:00:00
1   2013-02-06 01:00:00
2   2013-02-06 21:00:00
3   2013-02-06 22:00:00
4   2013-02-06 23:00:00
5   2013-03-06 01:00:00
6   2013-03-06 21:00:00
7   2013-03-06 22:00:00
8   2013-03-06 23:00:00
9   2013-04-06 01:00:00
dtype: datetime64[ns]
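
The output above reads 01-06-2013 month-first (6 January). If the dates are actually day-first, which is an assumption since the question does not say, a sketch like the following shows the difference that dayfirst=True makes. The two-row frame is sample data only.

```python
import pandas as pd

df = pd.DataFrame({
    "Date": ["01-06-2013", "02-06-2013"],
    "Time": ["23:00:00", "01:00:00"],
})
combined = df["Date"] + " " + df["Time"]

# Default parsing treats the first number as the month.
month_first = pd.to_datetime(combined)

# dayfirst=True treats the first number as the day instead.
day_first = pd.to_datetime(combined, dayfirst=True)
```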

Note: surprisingly (for me), this works fine with NaNs being converted to NaT, but it is worth checking that the conversion succeeded (perhaps using the errors='raise' argument).
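
A small sketch of the NaN behaviour and the errors argument, using made-up rows: one with a missing date and one with a malformed time. The explicit month-first format is an assumption chosen to match the output shown earlier.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Date": ["01-06-2013", np.nan, "02-06-2013"],
    "Time": ["23:00:00", "01:00:00", "not a time"],
})
combined = df["Date"] + " " + df["Time"]  # the row with a NaN date stays missing

fmt = "%m-%d-%Y %H:%M:%S"

# errors="coerce" turns unparseable strings into NaT instead of failing.
lenient = pd.to_datetime(combined, format=fmt, errors="coerce")

# errors="raise" (the default) stops on the malformed third row.
try:
    pd.to_datetime(combined, format=fmt, errors="raise")
    raised = False
except ValueError:
    raised = True
```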

Answer by jka.ne

The accepted answer works for columns that are of datatype string. For completeness: I came across this question when searching for how to do this when the columns are of datatypes date and time.

from datetime import datetime  # standard library; pd.datetime is deprecated since pandas 1.0
df.apply(lambda r: datetime.combine(r['date_column_name'], r['time_column_name']), axis=1)
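
For a runnable illustration of this row-wise approach, here is a minimal sketch. It uses the standard library's datetime.datetime.combine, since pd.datetime is deprecated in pandas 1.0 and later, and the two-row frame is hypothetical sample data.

```python
import datetime

import pandas as pd

# Hypothetical frame whose columns hold datetime.date and datetime.time objects.
df = pd.DataFrame({
    "date_column_name": [datetime.date(2013, 6, 1), datetime.date(2013, 6, 2)],
    "time_column_name": [datetime.time(23, 0), datetime.time(1, 0)],
})

# Combine each row's date and time into one datetime value.
combined = df.apply(
    lambda r: datetime.datetime.combine(r["date_column_name"], r["time_column_name"]),
    axis=1,
)
```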

Answer by jabellcu

I don't have enough reputation to comment on jka.ne's answer, so:

I had to amend jka.ne's line for it to work:

from datetime import datetime  # standard library; pd.datetime is deprecated since pandas 1.0
df.apply(lambda r: datetime.combine(r['date_column_name'], r['time_column_name'].time()), axis=1)

This might help others.

Also, I have tested a different approach, using replace instead of combine:

def combine_date_time(df, datecol, timecol):
    # Copy the time-of-day fields from timecol onto each value in datecol.
    return df.apply(lambda row: row[datecol].replace(
                                hour=row[timecol].hour,
                                minute=row[timecol].minute,
                                second=row[timecol].second),
                    axis=1)

which in the OP's case would be:

combine_date_time(df, 'Date', 'Time')
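
To try the replace approach end to end, here is a self-contained sketch. It restates the function so the snippet runs on its own, in a variant that also copies seconds. It assumes the Date column is datetime64 and the Time column holds datetime.time objects; the sample rows are made up.

```python
import datetime

import pandas as pd


def combine_date_time(df, datecol, timecol):
    # Copy hour, minute and second from the time column onto each date.
    return df.apply(
        lambda row: row[datecol].replace(
            hour=row[timecol].hour,
            minute=row[timecol].minute,
            second=row[timecol].second,
        ),
        axis=1,
    )


# Hypothetical sample: midnight timestamps plus datetime.time values.
df = pd.DataFrame({
    "Date": pd.to_datetime(["2013-06-01", "2013-06-02"]),
    "Time": [datetime.time(23, 0), datetime.time(1, 0)],
})
result = combine_date_time(df, "Date", "Time")
```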

I have timed both approaches on a relatively large dataset (>500,000 rows), and they have similar runtimes, but using combine is faster (59s for replace vs 50s for combine).

Answer by M.K Rana

You can use this to merge the date and time into a single column of the dataframe.

import pandas as pd
data_file = 'data.csv'  # path of your file

Read the .csv file, merging the Date and Time columns into a single Date_Time column:

data = pd.read_csv(data_file, parse_dates=[['Date', 'Time']]) 

You can use this line to keep both of the other columns as well.

data.set_index(['Date', 'Time'], drop=False)

Answer by Chris PERE

You can cast the columns to str if their types differ (datetime, timestamp or str) and then use to_datetime:

df['Date'] = pd.to_datetime(df['Date'].astype(str) + ' ' + df['Time'].astype(str))

Result:

0   2013-01-06 23:00:00
1   2013-02-06 01:00:00
2   2013-02-06 21:00:00
3   2013-02-06 22:00:00
4   2013-02-06 23:00:00
5   2013-03-06 01:00:00
6   2013-03-06 21:00:00
7   2013-03-06 22:00:00
8   2013-03-06 23:00:00
9   2013-04-06 01:00:00
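
A minimal sketch of the cast-then-parse idea, assuming the Date column holds datetime.date objects and the Time column holds datetime.time objects. The sample rows and the DateTime column name are illustrative.

```python
import datetime

import pandas as pd

df = pd.DataFrame({
    "Date": [datetime.date(2013, 6, 1), datetime.date(2013, 6, 2)],
    "Time": [datetime.time(23, 0), datetime.time(1, 0)],
})

# astype(str) renders "2013-06-01" and "23:00:00", which join into an ISO-like string.
df["DateTime"] = pd.to_datetime(df["Date"].astype(str) + " " + df["Time"].astype(str))
```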

Best,

Answer by toto_tico

The answer really depends on what your column types are. In my case, I had datetime and timedelta.

> df[['Date','Time']].dtypes
Date     datetime64[ns]
Time    timedelta64[ns]

If this is your case, then you just need to add the columns:

> df['Date'] + df['Time']
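
As a quick sketch of that case, the frame below is built directly with datetime64 and timedelta64 columns. The values and the DateTime column name are sample choices.

```python
import pandas as pd

df = pd.DataFrame({
    "Date": pd.to_datetime(["2013-06-01", "2013-06-02"]),
    "Time": pd.to_timedelta(["23:00:00", "01:00:00"]),
})

# datetime64 + timedelta64 adds element-wise, with no string handling needed.
df["DateTime"] = df["Date"] + df["Time"]
```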

Answer by jpp

You can also convert to datetime without string concatenation, by combining datetime and timedelta objects. Combined with pd.DataFrame.pop, you can remove the source series simultaneously:

df['DateTime'] = pd.to_datetime(df.pop('Date')) + pd.to_timedelta(df.pop('Time'))

print(df)

             DateTime
0 2013-01-06 23:00:00
1 2013-02-06 01:00:00
2 2013-02-06 21:00:00
3 2013-02-06 22:00:00
4 2013-02-06 23:00:00
5 2013-03-06 01:00:00
6 2013-03-06 21:00:00
7 2013-03-06 22:00:00
8 2013-03-06 23:00:00
9 2013-04-06 01:00:00

print(df.dtypes)

DateTime    datetime64[ns]
dtype: object

Answer by queise

First make sure to have the right data types:

df["Date"] = pd.to_datetime(df["Date"])
df["Time"] = pd.to_timedelta(df["Time"])

Then you can easily combine them:

df["DateTime"] = df["Date"] + df["Time"]

Answer by Stephen

Use the combine function:

datetime.datetime.combine(date, time)
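
For a single pair of values this needs only the standard library. The date and time below are sample values.

```python
import datetime

date = datetime.date(2013, 6, 1)
time = datetime.time(23, 0, 0)

# combine() merges a date and a time into one datetime object.
combined = datetime.datetime.combine(date, time)
```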

Answer by tgbrooks

My dataset had 1-second resolution data for a few days, and parsing with the methods suggested here was very slow. Instead I used:

dates = pandas.to_datetime(df.Date, cache=True)
times = pandas.to_timedelta(df.Time)
datetimes = dates + times

Note that using cache=True makes parsing the dates very efficient, since there are only a couple of unique dates in my files; that is not true of a combined date and time column.
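
A self-contained sketch of the same idea on made-up data: a few rows of 1-second readings across two dates. In recent pandas versions cache=True is already the default and is only used for inputs of at least 50 values, so this tiny sample demonstrates the calls rather than the speed-up.

```python
import pandas as pd

# Hypothetical 1-second data: several rows, only two distinct dates.
df = pd.DataFrame({
    "Date": ["2013-06-01"] * 3 + ["2013-06-02"] * 3,
    "Time": ["00:00:00", "00:00:01", "00:00:02"] * 2,
})

dates = pd.to_datetime(df["Date"], cache=True)  # repeated dates can be parsed once
times = pd.to_timedelta(df["Time"])
datetimes = dates + times
```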