Create a new value based on a column within groups in R

Problem description

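Within each group (bankAcctID), create a new column holding the difference between two dates if the intervals between consecutive dates are equal; otherwise, produce an NA value.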

Data

df <- structure(list(bankAcctID = c(439940L, 439940L, 439940L, 439940L, 439940L, 
439940L, 535211L, 535211L, 535211L, 535211L), date = structure(c(18334, 
18347, 18348, 18362, 18369, 18376, 18331, 18341, 18347, 18355 ), class = 
"Date")), row.names = c(NA, -10L), class = c("grouped_df", "tbl_df", "tbl", 
"data.frame"), groups = structure(list(bankAcctID = c(439940L, 535211L), 
.rows = list(1:6, 7:10)), row.names = c(NA, -2L), class = c("tbl_df", 
"tbl", "data.frame"), .drop = TRUE))

Tags: r

Solution


It is hard to tell exactly what your desired output is, but here is one possible solution:

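library(dplyr)

df %>%
  group_by(bankAcctID) %>%
  # dummy: gap between each date and the previous date in the same account
  mutate(dummy = date - lag(date)) %>%
  # diff: half the gap when it equals the previous gap, otherwise NA
  mutate(diff = ifelse(dummy == lag(dummy), dummy/2, NA))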

The dummy variable is included only to illustrate the logic; you can drop it by appending the line %>% select(-dummy)
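
For reference, here is a minimal end-to-end sketch of the pipeline with the helper column dropped. It assumes dplyr is installed, rebuilds the sample data as a plain data frame so the block runs on its own, and converts the gap to a plain number of days with as.numeric(); the name result is only illustrative. The division by 2 is carried over from the answer above, not a confirmed requirement of the question.

```r
library(dplyr)

# Rebuild the sample data (same values as the dput output in the question).
df <- data.frame(
  bankAcctID = c(rep(439940L, 6), rep(535211L, 4)),
  date = as.Date(c(18334, 18347, 18348, 18362, 18369, 18376,
                   18331, 18341, 18347, 18355), origin = "1970-01-01")
)

result <- df %>%
  group_by(bankAcctID) %>%
  # Gap in days between each date and the previous date in the same account.
  mutate(dummy = as.numeric(date - lag(date))) %>%
  # Keep half the gap only where it equals the previous gap; NA otherwise.
  mutate(diff = ifelse(dummy == lag(dummy), dummy / 2, NA)) %>%
  # Drop the helper column and remove the grouping.
  select(-dummy) %>%
  ungroup()

print(result)
```

With this data, only the last row of account 439940 gets a value: its gap of 7 days matches the previous gap of 7 days, so diff is 3.5. Every other row is NA, either because there is no earlier gap to compare against or because the two gaps differ.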

