R - Include starting point in cumsum function

Question

I have this data.frame:

       a b
 [1,]  1 0
 [2,]  2 0
 [3,]  3 0
 [4,]  4 0
 [5,]  5 0
 [6,]  6 1
 [7,]  7 2
 [8,]  8 3
 [9,]  9 4
[10,] 10 5

I want to apply cumsum on column a only when its corresponding value on column b is different from 0.

I tried this below but it doesn't include a starting condition on the cumsum:

   A <- cbind(a = 1:10, b = c(0,0,0,0,0,1,2,3,4,5))
   df_cumsum <- cbind(as.data.frame(A),
                      c = ave(A[,1], A[,2] != 0, FUN=cumsum))

Unfortunately, this also accumulates a over the rows where b is 0:

    a b  c
1   1 0  1
2   2 0  3
3   3 0  6
4   4 0 10
5   5 0 15
6   6 1  6
7   7 2 13
8   8 3 21
9   9 4 30
10 10 5 40

I would like to obtain:

    a b  c
1   1 0  0
2   2 0  0
3   3 0  0
4   4 0  0
5   5 0  0
6   6 1  6
7   7 2 13
8   8 3 21
9   9 4 30
10 10 5 40

Thanks for help!

Tags: r, dataframe, cumsum, multiple-conditions

Answer


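Assuming the input df given in reproducible form in the Note at the end, try this. It zeros out every value of a whose b is 0 and then takes the cumulative sum.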

transform(df, cum = cumsum((b > 0) * a))

giving:

    a b cum
1   1 0   0
2   2 0   0
3   3 0   0
4   4 0   0
5   5 0   0
6   6 1   6
7   7 2  13
8   8 3  21
9   9 4  30
10 10 5  40
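
To see why the one-liner works, here is a minimal sketch that breaks it into steps. The intermediate names (mask, masked, cum, cum_ne) are only illustrative and not part of the original answer, and the != 0 variant is an assumption about what "different from 0" should mean if b could ever be negative.

```r
# Input from the Note, rebuilt inline so the sketch is self-contained
df <- data.frame(a = 1:10, b = c(0, 0, 0, 0, 0, 1, 2, 3, 4, 5))

mask   <- df$b > 0        # FALSE for rows 1-5, TRUE for rows 6-10
masked <- mask * df$a     # FALSE/TRUE act as 0/1, so a is zeroed where b is 0
cum    <- cumsum(masked)  # running total starts at the first row with b > 0

# Variant (assumption): use != 0 so that negative b values would also count
cum_ne <- cumsum(ifelse(df$b != 0, df$a, 0))
```

For this data both versions give the same column as the output above; they would differ only if b contained negative values.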

Note

We assume this input, shown in reproducible form:

Lines <- "
  a b
  1 0
  2 0
  3 0
  4 0
  5 0
  6 1
  7 2
  8 3
  9 4
 10 5"
df <- read.table(text = Lines, header = TRUE)

Update

a and b were reversed. Fixed.

