r - dplyr solution for a cumsum that resets on a factor
Problem description
I need a dplyr solution that creates a cumsum column.
# Input dataframe
df <- data.frame(OilChanged = c("No","No","Yes","No","No","No","No","No","No","No","No","Yes","No"),
Odometer = c(300,350,410,420,430,450,500,600,600,600,650,660,700))
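# Create difference column - first row starting with zero
# (dplyr must be loaded for %>% and for lag() with a default argument;
# the column is named Odometer_delta here but Diff in the wanted result)
library(dplyr)
df <- df %>% dplyr::mutate(Odometer_delta = Odometer - lag(Odometer, default = Odometer[1]))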
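I am trying to reset the cumulative sum based on the factor column: the running total should restart on the row after each OilChanged == "Yes". The result needs to look exactly like this.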
# Wanted result dataframe
df <- data.frame(OilChanged = c("No","No","Yes","No","No","No","No","No","No","No","No","Yes","No"),
Odometer = c(300,350,410,420,430,450,500,600,600,600,650,660,700),
Diff = c(0,50,60,10,10,20,50,100,0,0,50,10,40),
CumSum = c(0,50,110,10,20,40,90,190,190,190,240,250,40))
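Solution
You can start a new group each time OilChanged == 'Yes' occurs, and then take the cumsum of the Diff values within each group. The lag() shifts the group index down by one row, so the 'Yes' row itself still counts toward the old total and the reset takes effect on the next row.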
library(dplyr)
df %>%
group_by(grp = lag(cumsum(OilChanged == 'Yes'), default = 0)) %>%
mutate(newcumsum = cumsum(Diff)) %>%
ungroup %>%
select(-grp)
# OilChanged Odometer Diff CumSum newcumsum
# <chr> <dbl> <dbl> <dbl> <dbl>
# 1 No 300 0 0 0
# 2 No 350 50 50 50
# 3 Yes 410 60 110 110
# 4 No 420 10 10 10
# 5 No 430 10 20 20
# 6 No 450 20 40 40
# 7 No 500 50 90 90
# 8 No 600 100 190 190
# 9 No 600 0 190 190
#10 No 600 0 190 190
#11 No 650 50 240 240
#12 Yes 660 10 250 250
#13 No 700 40 40 40
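To see why the grouping works, it can help to look at the intermediate group index on its own. The following is a minimal base R sketch, not part of the original answer; the names yes_count, grp and newcumsum are chosen here for illustration, and ave() stands in for the group_by() + mutate() step.

```r
# Same example data as in the question
OilChanged <- c("No","No","Yes","No","No","No","No","No","No","No","No","Yes","No")
Diff <- c(0,50,60,10,10,20,50,100,0,0,50,10,40)

# Running count of "Yes" rows: it increments ON the row where the oil was changed
yes_count <- cumsum(OilChanged == "Yes")

# Shift the count down by one row (what lag(..., default = 0) does in the answer),
# so the "Yes" row stays in the old group and the reset starts on the next row
grp <- c(0, head(yes_count, -1))

# Cumulative sum of Diff restarted within each group
newcumsum <- ave(Diff, grp, FUN = cumsum)

print(data.frame(OilChanged, Diff, grp, newcumsum))
```

Without the shift, the 'Yes' rows would open the new group themselves, and row 3 would show 60 instead of the wanted 110.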