首页 > 解决方案 > 使用已知信息按顺序执行计算并在 R 中按组填充未来值

问题描述

我的数据如下所示: intial<-tibble(start_date=rep(seq.Date(as.Date("2021-06-01"),as.Date("2021-10-01"),by="months"),4),end_date=rep(seq.Date(as.Date("2021-07-01"),as.Date("2021-11-01"),by="months"),4),id=rep(c(rep(1,5),rep(2,5)),2),group=c(rep("a",10),rep("b",10)),increase=c(c(4:8),c(5:9),c(6:10),c(7:11)),decrease=c(c(1:5),c(2:6),c(3:7),c(4:8)),start_count=c(c(10,13,16,19,NA),c(15,18,21,24,NA),c(20,23,26,29,NA),c(20,23,26,29,NA)),end_count=c(c(13,16,19,NA,NA),c(18,21,24,NA,NA),c(23,26,29,NA,NA),c(23,26,29,NA,NA))) print(initial)

  start_date end_date      id group increase decrease start_count end_count
   <date>     <date>     <dbl> <chr>    <int>    <int>       <dbl>     <dbl>
 1 2021-06-01 2021-07-01     1 a            4        1          10        13
 2 2021-07-01 2021-08-01     1 a            5        2          13        16
 3 2021-08-01 2021-09-01     1 a            6        3          16        19
 4 2021-09-01 2021-10-01     1 a            7        4          19        NA
 5 2021-10-01 2021-11-01     1 a            8        5          NA        NA
 6 2021-06-01 2021-07-01     2 a            5        2          15        18
 7 2021-07-01 2021-08-01     2 a            6        3          18        21
 8 2021-08-01 2021-09-01     2 a            7        4          21        24
 9 2021-09-01 2021-10-01     2 a            8        5          24        NA
10 2021-10-01 2021-11-01     2 a            9        6          NA        NA
11 2021-06-01 2021-07-01     1 b            6        3          20        23
12 2021-07-01 2021-08-01     1 b            7        4          23        26
13 2021-08-01 2021-09-01     1 b            8        5          26        29
14 2021-09-01 2021-10-01     1 b            9        6          29        NA
15 2021-10-01 2021-11-01     1 b           10        7          NA        NA
16 2021-06-01 2021-07-01     2 b            7        4          20        23
17 2021-07-01 2021-08-01     2 b            8        5          23        26
18 2021-08-01 2021-09-01     2 b            9        6          26        29
19 2021-09-01 2021-10-01     2 b           10        7          29        NA
20 2021-10-01 2021-11-01     2 b           11        8          NA        NA

其中 和 的每个唯一组合都id包含group一个日期索引。我需要使用increase和字段中的值来计算和decrease顺序填充每个start_count和组合end_count中的每个日期。idgroup

end_count计算方式为:start_count+ increase- decrease,并且start_count是前一个日期的end_count

基本上我想最终得到如下所示的东西example: final<-tibble(start_date=rep(seq.Date(as.Date("2021-06-01"),as.Date("2021-10-01"),by="months"),4),end_date=rep(seq.Date(as.Date("2021-07-01"),as.Date("2021-11-01"),by="months"),4),id=rep(c(rep(1,5),rep(2,5)),2),group=c(rep("a",10),rep("b",10)),increase=c(c(4:8),c(5:9),c(6:10),c(7:11)),decrease=c(c(1:5),c(2:6),c(3:7),c(4:8)),start_count=c(c(10,13,16,19,21),c(15,18,21,24,27),c(20,23,26,29,32),c(20,23,26,29,31)),end_count=c(c(13,16,19,21,24),c(18,21,24,27,30),c(23,26,29,32,35),c(23,26,29,31,34)))

print(final)
 start_date end_date      id group increase decrease start_count end_count
   <date>     <date>     <dbl> <chr>    <int>    <int>       <dbl>     <dbl>
 1 2021-06-01 2021-07-01     1 a            4        1          10        13
 2 2021-07-01 2021-08-01     1 a            5        2          13        16
 3 2021-08-01 2021-09-01     1 a            6        3          16        19
 4 2021-09-01 2021-10-01     1 a            7        4          19        21
 5 2021-10-01 2021-11-01     1 a            8        5          21        24
 6 2021-06-01 2021-07-01     2 a            5        2          15        18
 7 2021-07-01 2021-08-01     2 a            6        3          18        21
 8 2021-08-01 2021-09-01     2 a            7        4          21        24
 9 2021-09-01 2021-10-01     2 a            8        5          24        27
10 2021-10-01 2021-11-01     2 a            9        6          27        30
11 2021-06-01 2021-07-01     1 b            6        3          20        23
12 2021-07-01 2021-08-01     1 b            7        4          23        26
13 2021-08-01 2021-09-01     1 b            8        5          26        29
14 2021-09-01 2021-10-01     1 b            9        6          29        32
15 2021-10-01 2021-11-01     1 b           10        7          32        35
16 2021-06-01 2021-07-01     2 b            7        4          20        23
17 2021-07-01 2021-08-01     2 b            8        5          23        26
18 2021-08-01 2021-09-01     2 b            9        6          26        29
19 2021-09-01 2021-10-01     2 b           10        7          29        31
20 2021-10-01 2021-11-01     2 b           11        8          31        34

标签: rgroup-bytime-seriessequential

解决方案


我们可以使用

library(dplyr)
library(zoo)
intial %>% 
   group_by(id, group) %>% 
   mutate(start_count_new = coalesce(start_count, 
       zoo::na.locf0(start_count) + increase - decrease), 
      end_count_new = coalesce(lead(start_count_new), 
        lead(start_count_new, default = last(start_count_new)) + increase - decrease))

推荐阅读