Combine Date and Time columns using python pandas

Question

Asked by richie

I have a pandas dataframe with the following columns:

Date              Time
01-06-2013      23:00:00
02-06-2013      01:00:00
02-06-2013      21:00:00
02-06-2013      22:00:00
02-06-2013      23:00:00
03-06-2013      01:00:00
03-06-2013      21:00:00
03-06-2013      22:00:00
03-06-2013      23:00:00
04-06-2013      01:00:00

How do I combine data['Date'] & data['Time'] to get the following? Is there a way of doing it using pd.to_datetime?

Date
01-06-2013 23:00:00
02-06-2013 01:00:00
02-06-2013 21:00:00
02-06-2013 22:00:00
02-06-2013 23:00:00
03-06-2013 01:00:00
03-06-2013 21:00:00
03-06-2013 22:00:00
03-06-2013 23:00:00
04-06-2013 01:00:00

Answer 1

Accepted answer by Andy Hayden

It's worth mentioning that you may have been able to read this in directly, e.g. if you were using read_csv, by passing parse_dates=[['Date', 'Time']].

Assuming these are just strings you could simply add them together (with a space), allowing you to apply to_datetime:

In [11]: df['Date'] + ' ' + df['Time']
Out[11]:
0    01-06-2013 23:00:00
1    02-06-2013 01:00:00
2    02-06-2013 21:00:00
3    02-06-2013 22:00:00
4    02-06-2013 23:00:00
5    03-06-2013 01:00:00
6    03-06-2013 21:00:00
7    03-06-2013 22:00:00
8    03-06-2013 23:00:00
9    04-06-2013 01:00:00
dtype: object

In [12]: pd.to_datetime(df['Date'] + ' ' + df['Time'])
Out[12]:
0   2013-01-06 23:00:00
1   2013-02-06 01:00:00
2   2013-02-06 21:00:00
3   2013-02-06 22:00:00
4   2013-02-06 23:00:00
5   2013-03-06 01:00:00
6   2013-03-06 21:00:00
7   2013-03-06 22:00:00
8   2013-03-06 23:00:00
9   2013-04-06 01:00:00
dtype: datetime64[ns]
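
Note that the output above reads 01-06-2013 as January 6, i.e. month first. If the dates in the question are day-first (an assumption, based on the values running over consecutive days), passing an explicit format to to_datetime avoids the ambiguity. A minimal sketch with a shortened copy of the data:

```python
import pandas as pd

# Shortened sample of the question's data; dates assumed to be day-month-year
df = pd.DataFrame({
    "Date": ["01-06-2013", "02-06-2013", "03-06-2013"],
    "Time": ["23:00:00", "01:00:00", "21:00:00"],
})

# Join the strings with a space and parse them with an explicit day-first format
combined = pd.to_datetime(df["Date"] + " " + df["Time"],
                          format="%d-%m-%Y %H:%M:%S")
print(combined)
```

With the format spelled out, every row lands in June 2013 rather than in a different month.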

Note: surprisingly (for me), this works fine with NaNs, which are converted to NaT, but it is worth checking that the conversion succeeded (perhaps using the raise argument, i.e. errors='raise').
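
To illustrate the point about NaNs and checking the conversion, here is a small sketch. The sample values (including the deliberately broken "not a date" entry) are made up for illustration. With errors="coerce", unparseable strings become NaT, so they can be told apart from rows that were missing to begin with; with the default errors="raise", the bad row would raise a ValueError instead.

```python
import numpy as np
import pandas as pd

# Hypothetical sample: one missing date and one malformed date
df = pd.DataFrame({
    "Date": ["01-06-2013", np.nan, "not a date"],
    "Time": ["23:00:00", "01:00:00", "21:00:00"],
})

joined = df["Date"] + " " + df["Time"]  # a missing Date stays missing after concatenation

# errors="coerce" turns unparseable strings into NaT instead of raising
parsed = pd.to_datetime(joined, format="%d-%m-%Y %H:%M:%S", errors="coerce")

# Rows that had a value but could not be parsed
failed = parsed.isna() & joined.notna()
print(df[failed])
```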

Answer 2

Answered by jka.ne

The accepted answer works for columns of datatype string. For completeness: I came across this question while searching for how to do this when the columns are of datatypes date and time.

from datetime import datetime  # pd.datetime is no longer available in recent pandas

df.apply(lambda r: datetime.combine(r['date_column_name'], r['time_column_name']), axis=1)
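
A self-contained sketch of this approach, assuming the columns hold datetime.date and datetime.time objects (the sample values are made up; the column names are the placeholders used above):

```python
from datetime import date, time, datetime

import pandas as pd

# Hypothetical frame whose columns hold date and time objects, not strings
df = pd.DataFrame({
    "date_column_name": [date(2013, 6, 1), date(2013, 6, 2)],
    "time_column_name": [time(23, 0), time(1, 0)],
})

# Combine each row's date and time into a single datetime
combined = df.apply(
    lambda r: datetime.combine(r["date_column_name"], r["time_column_name"]),
    axis=1,
)
print(combined)
```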

Answer 3

Answered by jabellcu

I don't have enough reputation to comment on jka.ne's answer, so:

I had to amend jka.ne's line for it to work:

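from datetime import datetime

# Assumes the time column holds datetime-like values, so .time() extracts the time of day
df.apply(lambda r: datetime.combine(r['date_column_name'], r['time_column_name'].time()), axis=1)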

This might help others.

Also, I have tested a different approach, using replace instead of combine:

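def combine_date_time(df, datecol, timecol):
    # Overwrite the time-of-day fields of each date with those from the time column
    return df.apply(lambda row: row[datecol].replace(
                                hour=row[timecol].hour,
                                minute=row[timecol].minute,
                                second=row[timecol].second),
                    axis=1)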

which in the OP's case would be:

combine_date_time(df, 'Date', 'Time')

I have timed both approaches on a relatively large dataset (>500,000 rows). They have similar runtimes, but combine is faster (59s for replace vs 50s for combine).
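
As a quick sanity check that the two approaches agree, here is a small sketch. It assumes the Date column is datetime64 and the Time column holds datetime.time objects; the two-row sample is made up and says nothing about the timings above.

```python
from datetime import datetime, time

import pandas as pd

# Hypothetical sample: datetime64 dates and time objects
df = pd.DataFrame({
    "Date": pd.to_datetime(["2013-06-01", "2013-06-02"]),
    "Time": [time(23, 0), time(1, 0)],
})

# Approach 1: datetime.combine on each row
via_combine = df.apply(
    lambda r: datetime.combine(r["Date"], r["Time"]), axis=1)

# Approach 2: replace the hour and minute of each date
via_replace = df.apply(
    lambda r: r["Date"].replace(hour=r["Time"].hour, minute=r["Time"].minute),
    axis=1)

# Element-wise comparison of the two results
same = list(via_combine) == list(via_replace)
print(same)
```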

Answer 4

Answered by M.K Rana

You can use this to merge the date and time into a single column of the dataframe.

import pandas as pd
data_file = 'data.csv'  # path to your file

Read the .csv file, merging the two columns into Date_Time:

data = pd.read_csv(data_file, parse_dates=[['Date', 'Time']])

You can also use this line to keep both of the other columns.

data.set_index(['Date', 'Time'], drop=False)
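
As far as I know, newer pandas releases (2.2 onward) deprecate the nested-list form of parse_dates, so an alternative is to read the columns as plain strings and combine them afterwards, which also keeps the original columns. A sketch under those assumptions; the inline CSV text, the Value column and the day-first format are all made up for illustration:

```python
import io

import pandas as pd

# Stand-in for data.csv so the example is self-contained
csv_text = (
    "Date,Time,Value\n"
    "01-06-2013,23:00:00,1\n"
    "02-06-2013,01:00:00,2\n"
)

# Read Date and Time as plain strings
data = pd.read_csv(io.StringIO(csv_text), dtype={"Date": str, "Time": str})

# Build the merged column; Date and Time stay in the frame
data["Date_Time"] = pd.to_datetime(data["Date"] + " " + data["Time"],
                                   format="%d-%m-%Y %H:%M:%S")
print(data)
```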

Answer 5

Answered by Chris PERE

You can cast the columns if the types are different (datetime and timestamp or str) and use to_datetime:

df.loc[:,'Date'] = pd.to_datetime(df.Date.astype(str)+' '+df.Time.astype(str))

Result:

0   2013-01-06 23:00:00
1   2013-02-06 01:00:00
2   2013-02-06 21:00:00
3   2013-02-06 22:00:00
4   2013-02-06 23:00:00
5   2013-03-06 01:00:00
6   2013-03-06 21:00:00
7   2013-03-06 22:00:00
8   2013-03-06 23:00:00
9   2013-04-06 01:00:00
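
A small sketch of the casting idea with mixed types, assuming Date holds datetime.date objects and Time holds strings (sample values made up). Writing the result to a new DateTime column instead of overwriting Date is my own choice here:

```python
from datetime import date

import pandas as pd

# Hypothetical frame with mixed types: date objects and time strings
df = pd.DataFrame({
    "Date": [date(2013, 6, 1), date(2013, 6, 2)],
    "Time": ["23:00:00", "01:00:00"],
})

# Cast both columns to str, join with a space, then parse.
# str(date) gives ISO "YYYY-MM-DD", so the joined string is unambiguous.
df["DateTime"] = pd.to_datetime(df["Date"].astype(str) + " " + df["Time"].astype(str))
print(df["DateTime"])
```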

Best,

Answer 6

Answered by toto_tico

The answer really depends on what your column types are. In my case, I had datetime and timedelta.

> df[['Date','Time']].dtypes
Date     datetime64[ns]
Time    timedelta64[ns]

If this is your case, then you just need to add the columns:

> df['Date'] + df['Time']
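
A self-contained sketch of this case, with made-up sample values, where the columns already have datetime and timedelta dtypes:

```python
import pandas as pd

# Hypothetical frame with datetime64 dates and timedelta64 times
df = pd.DataFrame({
    "Date": pd.to_datetime(["2013-06-01", "2013-06-02"]),
    "Time": pd.to_timedelta(["23:00:00", "01:00:00"]),
})

print(df.dtypes)

# datetime + timedelta gives a datetime, element-wise
combined = df["Date"] + df["Time"]
print(combined)
```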

Answer 7

Answered by jpp

You can also convert to datetime without string concatenation, by combining datetime and timedelta objects. Combined with pd.DataFrame.pop, you can remove the source series at the same time:

df['DateTime'] = pd.to_datetime(df.pop('Date')) + pd.to_timedelta(df.pop('Time'))

print(df)

             DateTime
0 2013-01-06 23:00:00
1 2013-02-06 01:00:00
2 2013-02-06 21:00:00
3 2013-02-06 22:00:00
4 2013-02-06 23:00:00
5 2013-03-06 01:00:00
6 2013-03-06 21:00:00
7 2013-03-06 22:00:00
8 2013-03-06 23:00:00
9 2013-04-06 01:00:00

print(df.dtypes)

DateTime    datetime64[ns]
dtype: object
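
The printed result above again treats the dates as month-first. A variant of the same pop-based idea with an explicit day-first format is sketched below. The format is an assumption about the question's data, and the extra Value column is made up to show that only the source columns are removed:

```python
import pandas as pd

# Shortened sample; Value is a hypothetical extra column
df = pd.DataFrame({
    "Date": ["01-06-2013", "02-06-2013"],
    "Time": ["23:00:00", "01:00:00"],
    "Value": [1, 2],
})

# pop() returns each column and removes it from the frame in one step
df["DateTime"] = (pd.to_datetime(df.pop("Date"), format="%d-%m-%Y")
                  + pd.to_timedelta(df.pop("Time")))
print(df)
```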

Answer 8

Answered by queise

First make sure to have the right data types:

df["Date"] = pd.to_datetime(df["Date"])
df["Time"] = pd.to_timedelta(df["Time"])

Then you can easily combine them:

df["DateTime"] = df["Date"] + df["Time"]

Answer 9

Answered by Stephen

Use the combine function:

datetime.datetime.combine(date, time)
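
For a single pair of values, a minimal sketch using only the standard library could look like this. It assumes the inputs are strings in the question's layout and that the date is day-month-year:

```python
from datetime import datetime

# One row from the question, as strings
date_str = "01-06-2013"
time_str = "23:00:00"

# Parse each part separately, then combine them
d = datetime.strptime(date_str, "%d-%m-%Y").date()
t = datetime.strptime(time_str, "%H:%M:%S").time()
combined = datetime.combine(d, t)
print(combined)
```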

Answer 10

Answered by tgbrooks

My dataset had 1-second resolution data spanning a few days, and parsing with the methods suggested here was very slow. Instead I used:

import pandas

dates = pandas.to_datetime(df.Date, cache=True)
times = pandas.to_timedelta(df.Time)
datetimes = dates + times

Note that cache=True makes parsing the dates very efficient, since there are only a couple of unique dates in my files. That would not be the case for a combined date and time column, where almost every value is unique.
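
A small self-contained sketch of this pattern, with made-up sample data in which a few dates repeat over many per-second rows. The explicit day-first format is an assumption about the data layout:

```python
import pandas as pd

# Hypothetical sample: two unique dates, several per-second rows each
df = pd.DataFrame({
    "Date": ["01-06-2013"] * 3 + ["02-06-2013"] * 3,
    "Time": ["00:00:00", "00:00:01", "00:00:02"] * 2,
})

# Parse the dates with caching enabled, and the times as timedeltas
dates = pd.to_datetime(df["Date"], format="%d-%m-%Y", cache=True)
times = pd.to_timedelta(df["Time"])

df["DateTime"] = dates + times
print(df["DateTime"])
```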
