PostgreSQL: extracting parts of a timestamp
Original question: http://stackoverflow.com/questions/15019284/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Extract parts of timestamp
Asked by MortenSickel
I have some datasets where animals have been logged every 5 seconds. The data are stored in PostGIS and I am using R to analyse and plot the data. I would like to add some markers to the plots showing the position at each hour. I am pulling subsets of the dataset into R, making a data frame that holds, among other information, the timestamp and the x and y coordinates for each point.
Part of the summary() of my data set (approx. 10000 rows in total):
       id              datetime                        date
 Min.   :2664295   Min.   :2009-08-21 05:00:04   Min.   :2009-08-21
 1st Qu.:2666819   1st Qu.:2009-08-21 08:30:15   1st Qu.:2009-08-21
 Median :2669342   Median :2009-08-21 12:00:03   Median :2009-08-21
 Mean   :2669342   Mean   :2009-08-21 11:30:10   Mean   :2009-08-21
 3rd Qu.:2671866   3rd Qu.:2009-08-21 14:22:44   3rd Qu.:2009-08-21
 Max.   :2674390   Max.   :2009-08-21 16:59:58   Max.   :2009-08-21
  lokalitet             cowid          x                y
 Length:10          Min.   :553   Min.   :455329   Min.   :6712350
 Class :character   1st Qu.:553   1st Qu.:455718   1st Qu.:6712744
 Mode  :character   Median :553   Median :456154   Median :6713068
                    Mean   :553   Mean   :456011   Mean   :6713074
                    3rd Qu.:553   3rd Qu.:456274   3rd Qu.:6713470
                    Max.   :553   Max.   :456361   Max.   :6713596
So, what I would like to do is subset the data set on datetime where the minute part is 0 and the second part is < 5 (i.e. of the timestamps shown in the summary I would like to get 05:00:04 and 12:00:03). I know I can do this in postgres, but as the base dataset is quite large (approx. 4 million rows) and I have not made an index on partial timestamps, I think it may be better to do this on the subset in R. And no, as I want to do this a number of times, I do not want to just manually find the first row that matches and then take every 720th row from there on.
Answered by juba
With the lubridate package, it is quite easy to extract minutes and seconds from your dates and then to test on them. For example:
library(lubridate)
## Sample data
dates <- as.POSIXlt(c("2009-08-21 05:00:04","2009-08-21 08:30:15","2009-08-21 12:00:03","2009-08-21 11:30:10","2009-08-21 14:22:44","2009-08-21 16:59:58"))
## How to extract minutes
minute(dates)
# [1] 0 30 0 30 22 59
## How to extract seconds
second(dates)
# [1] 4 15 3 10 44 58
## Select dates from minutes and seconds values
dates[minute(dates)==0 & second(dates) < 5]
# [1] "2009-08-21 05:00:04" "2009-08-21 12:00:03"
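If adding a package is not an option, the same test can probably be written with base R alone, since a POSIXlt object exposes its fields as `$min` and `$sec`. A minimal sketch, assuming a data frame called `track` (a hypothetical name) with a `datetime` column; the coordinates and the UTC time zone are placeholders chosen for illustration:

```r
# Hypothetical data frame standing in for the subset pulled from PostGIS
track <- data.frame(
  datetime = as.POSIXct(c("2009-08-21 05:00:04", "2009-08-21 08:30:15",
                          "2009-08-21 12:00:03", "2009-08-21 16:59:58"),
                        tz = "UTC"),
  x = c(455329, 455718, 456154, 456361)  # placeholder coordinates
)

# Convert once to POSIXlt to reach the minute and second fields
lt <- as.POSIXlt(track$datetime)

# Keep rows logged within the first 5 seconds of an hour
hourly <- track[lt$min == 0 & lt$sec < 5, ]
hourly$datetime
```

Converting to POSIXlt outside the data frame keeps the stored column as POSIXct, which is the class data frames handle most comfortably.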
Answered by IRTFM
You could make a helper function that takes a format string and converts the result to numeric. I'm going to assume the datetimes are POSIXct rather than POSIXlt. (POSIXlt does not play nicely with data frame operations.)
ndtfrm <- function(dt, frm) as.numeric(format(dt, frm))
ndtfrm(dates, "%M")
#[1] 0 30 0 30 22 59
ndtfrm(dates, "%S")
#[1] 4 15 3 10 44 58
dfrm[ ndtfrm(dfrm$dates, "%M") ==0 & ndtfrm(dfrm$dates, "%S") < 5 , ]
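The last line above assumes a data frame `dfrm` with a `dates` column, which the answer does not define. A small sketch of the whole thing, with `dfrm` built from the sample timestamps used earlier (the `id` column, the result name `on_the_hour`, and the UTC time zone are assumptions made for illustration):

```r
# Helper from the answer: format one date-time part and convert it to numeric
ndtfrm <- function(dt, frm) as.numeric(format(dt, frm))

# Illustrative data frame with a POSIXct column
dfrm <- data.frame(
  id = 1:6,
  dates = as.POSIXct(c("2009-08-21 05:00:04", "2009-08-21 08:30:15",
                       "2009-08-21 12:00:03", "2009-08-21 11:30:10",
                       "2009-08-21 14:22:44", "2009-08-21 16:59:58"),
                     tz = "UTC")
)

# Rows where the minute is 0 and the second is below 5
on_the_hour <- dfrm[ndtfrm(dfrm$dates, "%M") == 0 & ndtfrm(dfrm$dates, "%S") < 5, ]
on_the_hour$id
```

Because `format()` goes through a string, this is likely slower than reading the POSIXlt fields directly, but the same helper works for any `strftime` code such as `"%H"` or `"%j"`.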
Answered by MortenSickel
Thanks for your answers, but it turned out it was easiest to do it in postgres ...
A simple
SELECT ..., case when extract(second from datetime)<5
and extract(minute from datetime) = 28
then extract(hour from datetime) end as marker FROM ....
gave me what I needed - and as I am just fetching this data together with the rest of the data set, there is no need to search for those points.
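For comparison, a similar marker column could be computed on the R side after the data have been fetched. This is a sketch under assumptions, not the author's code: the function name `hour_marker` and its `target_minute` argument are made up here, and the minute to match is left as a parameter because the query above tests for 28 while the question asked for 0.

```r
# Return the hour of a fix if it falls within the first 5 seconds of the
# target minute, and NA otherwise (mirrors the CASE expression in the query)
hour_marker <- function(dt, target_minute = 0) {
  lt <- as.POSIXlt(dt)
  ifelse(lt$min == target_minute & lt$sec < 5, lt$hour, NA_integer_)
}

# Illustrative timestamps; the UTC time zone is an assumption
datetime <- as.POSIXct(c("2009-08-21 05:00:04", "2009-08-21 08:30:15",
                         "2009-08-21 12:00:03"), tz = "UTC")
marker <- hour_marker(datetime)
marker
```

Rows with a non-NA `marker` are the ones to label on the plot, so no separate search through the data frame is needed.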