Expanding a time series in R

Problem description

I have the following sample data:

name <- c("Alpha","Beta")
numerical_ID <- c(1,5)
first_date <- c("2019-01-28","2017-07-16")
last_date <- c("2019-07-19",  "2020-07-14")
interval_calendar_days <- c(30,180)
sample.data <- data.frame(name,numerical_ID,first_date,last_date,interval_calendar_days)

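This means I have a transaction that starts on first_date, recurs every x calendar days (where x = interval_calendar_days), and ends on last_date. The variables name and numerical_ID are attributes of every occurrence of this transaction.

I would like to create the following time series, but I am not sure how: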

      name    numerical_ID date        
 [1,] "Alpha" "1"          "2019-01-28"
 [2,] "Alpha" "1"          "2019-02-27"
 [3,] "Alpha" "1"          "2019-03-29"
 [4,] "Alpha" "1"          "2019-04-28"
 [5,] "Alpha" "1"          "2019-05-28"
 [6,] "Alpha" "1"          "2019-06-27"
 [7,] "Alpha" "1"          "2019-07-19"
 [8,] "Beta"  "5"          "2017-07-16"
 [9,] "Beta"  "5"          "2018-01-12"
[10,] "Beta"  "5"          "2018-07-11"
[11,] "Beta"  "5"          "2019-01-07"
[12,] "Beta"  "5"          "2019-07-06"
[13,] "Beta"  "5"          "2020-01-02"
[14,] "Beta"  "5"          "2020-06-30"
[15,] "Beta"  "5"          "2020-07-14"

Tags: r

Solution


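One option is to first convert the date columns to Date class, then use pmap to build, for each row, a sequence of dates from first_date to last_date at the interval given in interval_calendar_days (with last_date appended so the series ends on it), and finally unnest the resulting list column.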

library(tidyverse)
library(lubridate)
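sample.data %>%
     # parse every column whose name contains "date" into Date class
     mutate_at(vars(matches("date")), ymd) %>% 
     # per row: regular sequence of dates, then last_date appended
     transmute(name, numerical_ID, date = pmap(select(., 
           first_date, last_date, interval_calendar_days), ~ 
                  c(seq(..1, ..2, by = ..3), ..2))) %>%
     # expand the list column into one row per date
     unnest(cols = date)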
# A tibble: 15 x 3
#   name  numerical_ID date      
#   <fct>        <dbl> <date>    
# 1 Alpha            1 2019-01-28
# 2 Alpha            1 2019-02-27
# 3 Alpha            1 2019-03-29
# 4 Alpha            1 2019-04-28
# 5 Alpha            1 2019-05-28
# 6 Alpha            1 2019-06-27
# 7 Alpha            1 2019-07-19
# 8 Beta             5 2017-07-16
# 9 Beta             5 2018-01-12
#10 Beta             5 2018-07-11
#11 Beta             5 2019-01-07
#12 Beta             5 2019-07-06
#13 Beta             5 2020-01-02
#14 Beta             5 2020-06-30
#15 Beta             5 2020-07-14
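
The per-row step can be looked at on its own. The following is a minimal base R sketch, not part of the original answer: the helper name expand_dates is hypothetical, and wrapping the result in unique() is an added assumption to avoid a repeated final row when last_date happens to fall exactly on one of the regular steps.

```r
# Hypothetical helper (not from the original answer): expand one transaction
# into its occurrence dates.
expand_dates <- function(first_date, last_date, interval_days) {
  first_date <- as.Date(first_date)
  last_date  <- as.Date(last_date)
  # Regular occurrences; seq() stops at the last step that does not pass last_date
  regular <- seq(first_date, last_date, by = interval_days)
  # Append the closing date; unique() drops it again if it is already on the grid
  unique(c(regular, last_date))
}

expand_dates("2019-01-28", "2019-07-19", 30)  # the "Alpha" row
expand_dates("2019-01-01", "2019-03-02", 30)  # last_date lands exactly on a step
```

For the sample data the result is the same with or without unique(), because neither last_date falls on a regular step.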

It can also be done in base R using Map:

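# Build one Date vector per row: the regular sequence plus the closing last_date
lst1 <- do.call(Map, c(f = function(x, y, z) 
     c(seq(as.Date(x), as.Date(y), by = z),
        as.Date(y)), unname(sample.data[-(1:2)])))
# Repeat name and numerical_ID once per generated date, then attach the dates
out <-  sample.data[1:2][rep(seq_len(nrow(sample.data)), lengths(lst1)),]
out$date <- do.call(c, unname(lst1))
row.names(out) <- NULL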
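
For readers who find the do.call(Map, ...) construction hard to follow, here is a self-contained sketch of the same base R idea with the columns passed to Map by name. It is an illustrative variant rather than the original answer's code: the names dates and out2 are made up for this example, stringsAsFactors = FALSE is set explicitly so the behaviour does not depend on the R version, and unique() is an added guard against a duplicated final date.

```r
sample.data <- data.frame(
  name = c("Alpha", "Beta"),
  numerical_ID = c(1, 5),
  first_date = c("2019-01-28", "2017-07-16"),
  last_date = c("2019-07-19", "2020-07-14"),
  interval_calendar_days = c(30, 180),
  stringsAsFactors = FALSE
)

# One Date vector per row: regular steps plus the closing last_date
dates <- Map(
  function(from, to, step) {
    unique(c(seq(as.Date(from), as.Date(to), by = step), as.Date(to)))
  },
  sample.data$first_date,
  sample.data$last_date,
  sample.data$interval_calendar_days
)

# Repeat the identifying columns once per generated date and bind the dates on
out2 <- data.frame(
  name = rep(sample.data$name, lengths(dates)),
  numerical_ID = rep(sample.data$numerical_ID, lengths(dates)),
  date = do.call(c, unname(dates))
)

nrow(out2)  # 15 rows for the sample data: 7 for Alpha, 8 for Beta
```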
