Converting times with milliseconds to POSIX for time series analysis

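Problem description

I am trying to do some time series analysis and am struggling to get my time variable into the correct format.

I have been able to get the Dates, Times, and Date_Times as character, but I cannot get them recognized as POSIX times.

Sample data: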

Data <- structure(list(Date = c("2018-05-25", "2018-05-25", "2018-05-25", 
"2018-05-25", "2018-05-25", "2018-05-25", "2018-05-25", "2018-05-25", 
"2018-05-25", "2018-05-25", "2018-05-25", "2018-05-25", "2018-05-25", 
"2018-05-25", "2018-05-25", "2018-05-25", "2018-05-25", "2018-05-25", 
"2018-05-25", "2018-05-25", "2018-05-25", "2018-05-25", "2018-05-25", 
"2018-05-25", "2018-05-25"), Time = c("08:36:46.203", "08:36:47.552", 
"08:36:48.222", "08:36:49.429", "08:36:50.409", "08:36:51.471", 
"08:36:52.393", "08:36:53.422", "08:36:54.482", "08:36:55.436", 
"08:36:56.454", "08:36:57.552", "08:36:58.385", "08:36:59.473", 
"08:37:00.368", "08:37:01.477", "08:37:02.399", "08:37:03.596", 
"08:37:04.457", "08:37:05.593", "08:37:06.595", "08:37:07.582", 
"08:37:08.506", "08:37:09.579", "08:37:10.586"), Date_Time = c("2018-05-25 08:36:46.203", 
"2018-05-25 08:36:47.552", "2018-05-25 08:36:48.222", "2018-05-25 08:36:49.429", 
"2018-05-25 08:36:50.409", "2018-05-25 08:36:51.471", "2018-05-25 08:36:52.393", 
"2018-05-25 08:36:53.422", "2018-05-25 08:36:54.482", "2018-05-25 08:36:55.436", 
"2018-05-25 08:36:56.454", "2018-05-25 08:36:57.552", "2018-05-25 08:36:58.385", 
"2018-05-25 08:36:59.473", "2018-05-25 08:37:00.368", "2018-05-25 08:37:01.477", 
"2018-05-25 08:37:02.399", "2018-05-25 08:37:03.596", "2018-05-25 08:37:04.457", 
"2018-05-25 08:37:05.593", "2018-05-25 08:37:06.595", "2018-05-25 08:37:07.582", 
"2018-05-25 08:37:08.506", "2018-05-25 08:37:09.579", "2018-05-25 08:37:10.586"
)), class = "data.frame", row.names = c(NA, -25L))

Trying to convert to POSIX with strptime left me with NAs:

Data$Date_Time <- strptime(Data$Date_Time, "%Y-%m-%d %H:%M:%0S")


         Date         Time               Date_Time POS_Date_Time
1  2018-05-25 08:36:46.203 2018-05-25 08:36:46.203          <NA>
2  2018-05-25 08:36:47.552 2018-05-25 08:36:47.552          <NA>
3  2018-05-25 08:36:48.222 2018-05-25 08:36:48.222          <NA>
4  2018-05-25 08:36:49.429 2018-05-25 08:36:49.429          <NA>
5  2018-05-25 08:36:50.409 2018-05-25 08:36:50.409          <NA>
6  2018-05-25 08:36:51.471 2018-05-25 08:36:51.471          <NA>
7  2018-05-25 08:36:52.393 2018-05-25 08:36:52.393          <NA>
8  2018-05-25 08:36:53.422 2018-05-25 08:36:53.422          <NA>

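How can I create readable times from this data?

Tags: r, datetime, posix

Solution

You should not use strptime to create a column in a data frame. It produces a POSIXlt column, which is awkward to work with. Use as.POSIXct instead. Also pay attention to the warnings: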

Data$Date_Time <- as.POSIXct(Data$Date_Time, "%Y-%m-%d %H:%M:%S")
Warning messages:
1: In strptime(xx, f, tz = tz) : unknown timezone '%Y-%m-%d %H:%M:%S'
2: In as.POSIXct.POSIXlt(x) : unknown timezone '%Y-%m-%d %H:%M:%S'
3: In strptime(x, f, tz = tz) : unknown timezone '%Y-%m-%d %H:%M:%S'
4: In as.POSIXct.POSIXlt(as.POSIXlt(x, tz, ...), tz, ...) :
  unknown timezone '%Y-%m-%d %H:%M:%S'
> Data
         Date         Time           Date_Time
1  2018-05-25 08:36:46.203 2018-05-25 08:36:46
2  2018-05-25 08:36:47.552 2018-05-25 08:36:47
3  2018-05-25 08:36:48.222 2018-05-25 08:36:48
4  2018-05-25 08:36:49.429 2018-05-25 08:36:49
5  2018-05-25 08:36:50.409 2018-05-25 08:36:50
6  2018-05-25 08:36:51.471 2018-05-25 08:36:51
7  2018-05-25 08:36:52.393 2018-05-25 08:36:52
8  2018-05-25 08:36:53.422 2018-05-25 08:36:53
9  2018-05-25 08:36:54.482 2018-05-25 08:36:54
10 2018-05-25 08:36:55.436 2018-05-25 08:36:55
11 2018-05-25 08:36:56.454 2018-05-25 08:36:56
12 2018-05-25 08:36:57.552 2018-05-25 08:36:57
13 2018-05-25 08:36:58.385 2018-05-25 08:36:58
14 2018-05-25 08:36:59.473 2018-05-25 08:36:59
15 2018-05-25 08:37:00.368 2018-05-25 08:37:00
16 2018-05-25 08:37:01.477 2018-05-25 08:37:01
17 2018-05-25 08:37:02.399 2018-05-25 08:37:02
18 2018-05-25 08:37:03.596 2018-05-25 08:37:03
19 2018-05-25 08:37:04.457 2018-05-25 08:37:04
20 2018-05-25 08:37:05.593 2018-05-25 08:37:05
21 2018-05-25 08:37:06.595 2018-05-25 08:37:06
22 2018-05-25 08:37:07.582 2018-05-25 08:37:07
23 2018-05-25 08:37:08.506 2018-05-25 08:37:08
24 2018-05-25 08:37:09.579 2018-05-25 08:37:09
25 2018-05-25 08:37:10.586 2018-05-25 08:37:10
Warning message:
In as.POSIXlt.POSIXct(x, tz) : unknown timezone '%Y-%m-%d %H:%M:%S'
> dput(head(Data))
structure(list(Date = c("2018-05-25", "2018-05-25", "2018-05-25", 
"2018-05-25", "2018-05-25", "2018-05-25"), Time = c("08:36:46.203", 
"08:36:47.552", "08:36:48.222", "08:36:49.429", "08:36:50.409", 
"08:36:51.471"), Date_Time = structure(c(1527237406.203, 1527237407.552, 
1527237408.222, 1527237409.429, 1527237410.409, 1527237411.471
), class = c("POSIXct", "POSIXt"), tzone = "%Y-%m-%d %H:%M:%S")), row.names = c(NA, 
6L), class = "data.frame")

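I should have passed the format string by name. The second positional argument of as.POSIXct is tz, which is why the format string was treated as a time zone in the warnings above: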

Data$Date_Time <- as.POSIXct(Data$Date_Time, format="%Y-%m-%d %H:%M:%S")
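Since the timestamps carry milliseconds, one variant worth considering is `%OS` (capital letter O), which R's strptime documentation describes as accepting fractional seconds on input. The digit zero in the question's `%0S` is likely what produced the NA values. The sketch below is only an illustration under assumptions that are not in the original post: the `"UTC"` time zone, the `digits.secs` print option, and the three-element sample vector `x` are all choices made here.

```r
# Three sample timestamps copied from the question's data (hypothetical subset)
x <- c("2018-05-25 08:36:46.203",
       "2018-05-25 08:36:47.552",
       "2018-05-25 08:36:48.222")

# %OS (letter O, not zero) parses fractional seconds; tz = "UTC" is an assumption
pt <- as.POSIXct(x, format = "%Y-%m-%d %H:%M:%OS", tz = "UTC")

# The fraction is stored either way; this option only affects how it is printed
options(digits.secs = 3)
print(pt)

# Intervals between consecutive observations, in seconds
dt <- diff(as.numeric(pt))
print(round(dt, 3))
```

The printed timestamps may differ from the input in the last digit because of floating-point representation, so comparisons are safer on the numeric values with a tolerance than on the formatted strings.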

Here is the structure that results from the strptime version:

dput(head(Data))
structure(list(Date = c("2018-05-25", "2018-05-25", "2018-05-25", 
"2018-05-25", "2018-05-25", "2018-05-25"), Time = c("08:36:46.203", 
"08:36:47.552", "08:36:48.222", "08:36:49.429", "08:36:50.409", 
"08:36:51.471"), Date_Time = structure(list(sec = c(46, 47, 48, 
49, 50, 51), min = c(36L, 36L, 36L, 36L, 36L, 36L), hour = c(8L, 
8L, 8L, 8L, 8L, 8L), mday = c(25L, 25L, 25L, 25L, 25L, 25L), 
    mon = c(4L, 4L, 4L, 4L, 4L, 4L), year = c(118L, 118L, 118L, 
    118L, 118L, 118L), wday = c(5L, 5L, 5L, 5L, 5L, 5L), yday = c(144L, 
    144L, 144L, 144L, 144L, 144L), isdst = c(1L, 1L, 1L, 1L, 
    1L, 1L), zone = c("PDT", "PDT", "PDT", "PDT", "PDT", "PDT"
    ), gmtoff = c(NA_integer_, NA_integer_, NA_integer_, NA_integer_, 
    NA_integer_, NA_integer_)), class = c("POSIXlt", "POSIXt"
))), row.names = c(NA, 6L), class = "data.frame")
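To make the difference between the two classes concrete, the following sketch compares what strptime and as.POSIXct return for one value. The single timestamp and the `"UTC"` time zone are assumptions made for illustration, not part of the original post. A POSIXlt value is a list of components (sec, min, hour, and so on), whereas a POSIXct value is one number of seconds since the epoch, which is why the latter fits more naturally in a data frame column.

```r
# One timestamp from the question's data (hypothetical single-value example)
x <- "2018-05-25 08:36:46.203"

# strptime returns POSIXlt; tz = "UTC" is an assumption for reproducibility
lt <- strptime(x, format = "%Y-%m-%d %H:%M:%OS", tz = "UTC")

# Converting to POSIXct gives a compact numeric representation
ct <- as.POSIXct(lt)

print(class(lt))                # "POSIXlt" "POSIXt"
print(class(ct))                # "POSIXct" "POSIXt"

# POSIXlt is a list underneath, POSIXct is a plain number
print(is.list(unclass(lt)))     # TRUE
print(is.numeric(unclass(ct)))  # TRUE

# Components of the list form can be read directly
print(lt$hour)                  # 8
print(lt$min)                   # 36
```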
