Reset cumsum to the value in a variable

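Problem description

I have a simple dataset with the columns date, sample, and applied. Each day, the value of sample should decrease by 1 and increase by the value of applied. When we take another sample, we start over from the new value. The expected result is given in the variable estimate. I am trying to calculate the variable estimate in R.

This is my dataset: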

df <- structure(
    list(
        date = structure(c(17591, 17592, 17593, 17594, 
          17595, 17596, 17597, 17598, 17599, 17600, 17601, 17602, 17603, 
          17604, 17605, 17606, 17607, 17608, 17609, 17610, 17611, 17612, 
          17613, 17614), class = "Date"),
        sample = c(30, NA, NA, NA, NA, 20, NA, NA, 40, NA, NA, 
          NA, NA, NA, 35, NA, NA, 15, NA, NA, NA, 
          NA, 5, NA), 
        applied = c(NA, NA, 10, NA, NA, NA, NA, 10, NA, NA, 
          15, NA, NA, NA, 10, NA, NA, NA, 15, NA, NA, 10, NA, NA), 
        estimate = c(30, 29, 38, 37, 36, 20, 19, 28, 40, 39, 53, 52, 
          51, 50, 35, 34, 33, 15, 29, 28, 27, 36, 5, 4)
        ), 
    class = "data.frame", row.names = c(NA, -24L)
)

Tags: r, cumsum

Solution


This could probably be cleaned up a bit, but it works:

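library(dplyr)

df %>%
  # start a new group at every non-missing sample
  mutate(group = cumsum(!is.na(sample))) %>%
  group_by(group) %>%
  mutate(
    # daily change: 0 on a sample day, otherwise -1 plus any applied amount
    to_add = ifelse(!is.na(sample), 0, ifelse(is.na(applied), -1, applied - 1)),
    # the group's starting sample plus the running total of daily changes
    result = first(sample) + cumsum(to_add)
  )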
# # A tibble: 24 x 7
# # Groups:   group [6]
#    date       sample applied estimate group to_add result
#    <date>      <dbl>   <dbl>    <dbl> <int>  <dbl>  <dbl>
#  1 2018-03-01     30      NA       30     1      0     30
#  2 2018-03-02     NA      NA       29     1     -1     29
#  3 2018-03-03     NA      10       38     1      9     38
#  4 2018-03-04     NA      NA       37     1     -1     37
#  5 2018-03-05     NA      NA       36     1     -1     36
#  6 2018-03-06     20      NA       20     2      0     20
#  7 2018-03-07     NA      NA       19     2     -1     19
#  8 2018-03-08     NA      10       28     2      9     28
#  9 2018-03-09     40      NA       40     3      0     40
# 10 2018-03-10     NA      NA       39     3     -1     39
# # ... with 14 more rows
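
For comparison, the same grouping idea can be sketched in base R without dplyr. This is only an illustration, not part of the original answer: it rebuilds the question's columns as plain vectors (dates omitted, since they do not affect the calculation), and the names running, start, and result are made up for this sketch. It assumes the first row always carries a sample, so that every row belongs to a group with a starting value.

```r
# Columns copied from the question's dataset.
sample   <- c(30, NA, NA, NA, NA, 20, NA, NA, 40, NA, NA, NA,
              NA, NA, 35, NA, NA, 15, NA, NA, NA, NA, 5, NA)
applied  <- c(NA, NA, 10, NA, NA, NA, NA, 10, NA, NA, 15, NA,
              NA, NA, 10, NA, NA, NA, 15, NA, NA, 10, NA, NA)
estimate <- c(30, 29, 38, 37, 36, 20, 19, 28, 40, 39, 53, 52,
              51, 50, 35, 34, 33, 15, 29, 28, 27, 36, 5, 4)

# A new group starts at every non-missing sample.
group <- cumsum(!is.na(sample))

# Daily change: 0 on a sample day, otherwise -1 plus any applied amount.
to_add <- ifelse(!is.na(sample), 0, ifelse(is.na(applied), 0, applied) - 1)

# Running total of the daily changes, restarted within each group.
running <- ave(to_add, group, FUN = cumsum)

# Starting value of each group, looked up by group index
# (assumes the first row has a sample, so group never equals 0).
start <- sample[!is.na(sample)][group]

result <- start + running
all(result == estimate)  # TRUE for the question's data
```

Here ave() plays the role of group_by() plus mutate(): it applies cumsum within each group and returns a vector of the original length.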
