Python pandas: extract date and time from timestamp

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use/share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/39662149/

Time: 2020-08-19 22:33:01  Source: igfitidea

pandas: extract date and time from timestamp

Tags: python, python-2.7, pandas, time-series

Asked by chintan s

I have a timestamp column where the timestamp is in the following format:

2016-06-16T21:35:17.098+01:00

I want to extract the date and time from it. I have done the following:

import pandas as pd

df['timestamp'] = df['timestamp'].apply(lambda x : pd.to_datetime(str(x)))

df['dates'] = df['timestamp'].dt.date

This worked for a while, but then it suddenly stopped working.

If I run df['dates'] = df['timestamp'].dt.date again, I get the following error:

Can only use .dt accessor with datetimelike values

Luckily, I have saved the data frame with dates in the csv, but I now want to create another column, time, in the format 23:00:00.051

EDIT


From the raw data file (15 million samples), the timestamp column looks like the following (first 5 samples):

            timestamp

0           2016-06-13T00:00:00.051+01:00
1           2016-06-13T00:00:00.718+01:00
2           2016-06-13T00:00:00.985+01:00
3           2016-06-13T00:00:02.431+01:00
4           2016-06-13T00:00:02.737+01:00

After running the following command:

df['timestamp'] = df['timestamp'].apply(lambda x : pd.to_datetime(str(x)))

the timestamp column looks like this, with dtype: datetime64[ns]

0    2016-06-12 23:00:00.051
1    2016-06-12 23:00:00.718
2    2016-06-12 23:00:00.985
3    2016-06-12 23:00:02.431
4    2016-06-12 23:00:02.737

Then, finally:

df['dates'] = df['timestamp'].dt.date

0           2016-06-12
1           2016-06-12
2           2016-06-12
3           2016-06-12
4           2016-06-12
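
Note that the dates above read 2016-06-12 while the raw strings say 2016-06-13, because the +01:00 offset was converted to UTC during parsing. The following is a minimal sketch, assuming the local wall-clock date and time are what is wanted; the two sample strings are taken from the question, and the fixed +01:00 offset passed to tz_convert is an assumption made only for this illustration.

```python
from datetime import timedelta, timezone

import pandas as pd

# Two raw strings copied from the question's sample data.
raw = pd.Series([
    "2016-06-13T00:00:00.051+01:00",
    "2016-06-13T00:00:02.431+01:00",
])

# Parse as timezone-aware UTC timestamps; the +01:00 offset is applied here.
utc = pd.to_datetime(raw, utc=True)
utc_dates = utc.dt.date  # dates as seen in UTC (the previous day)

# Assumption: convert back to a fixed +01:00 offset to recover local wall-clock values.
local = utc.dt.tz_convert(timezone(timedelta(hours=1)))
local_dates = local.dt.date

# %f yields six digits (microseconds); dropping the last three leaves milliseconds.
local_times = local.dt.strftime("%H:%M:%S.%f").str[:-3]

print(utc_dates.tolist())
print(local_dates.tolist())
print(local_times.tolist())
```

If the data spans daylight-saving changes, a named time zone would likely be a better fit than a fixed offset.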

EDIT 2


Found the mistake. I had cleaned the data and saved the data frame in a csv file, so I don't have to do the cleaning again. When I read the csv back, the timestamp dtype changes to object. Now how do I fix this?
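
The situation described in EDIT 2 can be reproduced with a small in-memory round trip. This is only a sketch: the one-row frame and the StringIO buffer are stand-ins for the real 15-million-row csv file.

```python
import io

import pandas as pd

# Hypothetical one-row frame standing in for the cleaned data.
df = pd.DataFrame({"timestamp": pd.to_datetime(["2016-06-12 23:00:00.051"])})

# Write to an in-memory CSV and read it back, as a stand-in for the file on disk.
buffer = io.StringIO()
df.to_csv(buffer, index=False)
buffer.seek(0)
reloaded = pd.read_csv(buffer)

# CSV stores plain text, so the column comes back as strings, not datetimes.
try:
    reloaded["timestamp"].dt.date
    accessor_failed = False
except AttributeError as exc:
    accessor_failed = True
    print(exc)
```

The printed message should match the error quoted in the question, since the .dt accessor is only available on datetime-like columns.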

Answered by Ajay Goyal

If the date is in string form, then:

import datetime

# convert each string to a datetime object
df['DateTime'] = [datetime.datetime.strptime(d, "%Y-%m-%d %H:%M") for d in df["DateTime"]]

# extract the date from each timestamp
df['Date'] = [d.date() for d in df['DateTime']]

# extract the time from each timestamp
df['Time'] = [d.time() for d in df['DateTime']]

If the object is already in Timestamp format, skip the first line of code.

The format %Y-%m-%d %H:%M means your timestamp strings must look like 2016-05-16 12:35. If they also carry seconds, as in 2016-05-16 12:35:00, use %Y-%m-%d %H:%M:%S instead.
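
As a quick check of how strptime matches a format string, here is a small sketch using only the standard library. The sample strings are made up for illustration; the point is that a string with a seconds field does not match a format that lacks %S.

```python
from datetime import datetime

fmt = "%Y-%m-%d %H:%M"

# A string without seconds matches the format exactly.
parsed = datetime.strptime("2016-05-16 12:35", fmt)

# A string with seconds leaves unconverted data, so strptime raises ValueError.
try:
    datetime.strptime("2016-05-16 12:35:00", fmt)
    seconds_ok = True
except ValueError:
    seconds_ok = False

# Adding %S to the format handles the seconds field.
parsed_with_seconds = datetime.strptime("2016-05-16 12:35:00", "%Y-%m-%d %H:%M:%S")

print(parsed, seconds_ok, parsed_with_seconds)
```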

Answered by Gursel Karacor

Do this first:


df['time'] = pd.to_datetime(df['timestamp'])

Then do your extraction as usual:

df['dates'] = df['time'].dt.date
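
Putting this together for the csv case, here is a minimal sketch under a few assumptions: csv_text is a made-up stand-in for the saved file, and the time column is built as a string in the 23:00:00.051 style the question asks for. Passing parse_dates to read_csv is shown as an alternative that converts the column while loading.

```python
import io

import pandas as pd

# Hypothetical stand-in for the cleaned CSV file from the question.
csv_text = "timestamp\n2016-06-12 23:00:00.051\n2016-06-12 23:00:00.718\n"

# Option 1: read as text, then convert the whole column in one vectorized call.
df = pd.read_csv(io.StringIO(csv_text))
df["timestamp"] = pd.to_datetime(df["timestamp"])

# The .dt accessor works again once the column is datetime-like.
df["dates"] = df["timestamp"].dt.date

# %f yields six digits (microseconds); dropping the last three leaves milliseconds.
df["time"] = df["timestamp"].dt.strftime("%H:%M:%S.%f").str[:-3]

# Option 2: let read_csv parse the column while loading.
df2 = pd.read_csv(io.StringIO(csv_text), parse_dates=["timestamp"])

print(df)
print(df2.dtypes)
```

Calling pd.to_datetime on the whole column, rather than through apply with a lambda, should also be noticeably faster on 15 million rows.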