r - How to change a column relative to another column and group
Problem description
I have the following columns:
PERNO TPURP loop
1 Loop trip 1
1 Loop trip 2
1 home 2
1 shopping 2
2 work 1
2 Loop trip 2
2 school 2
3 Looptrip 1
4 work 1
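For each PERNO, if TPURP == "Loop trip", I want to add 1 to loop in the rows after that row.
Within a PERNO, if a Loop trip is immediately followed by another Loop trip, we do not add 1 for the first one, only for the second.
Output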
PERNO TPURP loop
1 Loop trip 1
1 Loop trip 2
1 home 3
1 shopping 3
2 work 1
2 Loop trip 2
2 school 3
3 Looptrip 1
4 work 1
Data
structure(list(PERNO = c(1, 1, 1, 1, 1, 1), TPURP = structure(c(8L,
1L, 22L, 22L, 9L, 2L), .Label = c("(1) Working at home (for pay)",
"(2) All other home activities", "(3) Work/Job", "(4) All other activities at work",
"(5) Attending class", "(6) All other activities at school",
"(7) Change type of transportation/transfer", "(8) Dropped off passenger",
"(9) Picked up passenger", "(10) Other, specify - transportation",
"(11) Work/Business related", "(12) Service Private Vehicle",
"(13) Routine Shopping", "(14) Shopping for major purchases",
"(15) Household errands", "(16) Personal Business", "(17) Eat meal outside of home",
"(18) Health care", "(19) Civic/Religious activities", "(20) Recreation/Entertainment",
"(21) Visit friends/relative", "(24) Loop trip", "(97) Other, specify"
), class = "factor"), loop = c(1, 1, 2, 2, 2, 2)), class = c("tbl_df",
"tbl", "data.frame"), row.names = c(NA, -6L))
Solution
Using dplyr, we can group_by PERNO and increase the value of loop after the last occurrence of "Loop trip" in each group.
library(dplyr)

df %>%
  group_by(PERNO) %>%
  # within each PERNO, add 1 to loop on rows after the last "Loop trip"
  mutate(loop1 = ifelse(any(TPURP == "Loop trip") &
                          row_number() > max(which(TPURP == "Loop trip")),
                        loop + 1, loop))
# PERNO TPURP loop loop1
# <int> <fct> <int> <dbl>
#1 1 Loop trip 1 1
#2 1 Loop trip 2 2
#3 1 home 2 3
#4 1 shopping 2 3
#5 2 work 1 1
#6 2 Loop trip 2 2
#7 2 school 2 3
#8 3 Looptrip 1 1
#9 4 work 1 1
If any group has no "Loop trip", this returns a warning message, but it can be ignored.
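The warning comes from calling max() on an empty vector in groups without a match. As a minimal sketch of one way to avoid it, the rule can be moved into a small helper that returns the group unchanged when nothing matches. The name bump_after_last is hypothetical (not part of dplyr or the original answer), and the example applies it with base R split/unsplit so it runs without extra packages.

```r
# Toy data copied from the answer's example.
df <- data.frame(
  PERNO = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 4L),
  TPURP = c("Loop trip", "Loop trip", "home", "shopping", "work",
            "Loop trip", "school", "Looptrip", "work"),
  loop  = c(1L, 2L, 2L, 2L, 1L, 2L, 2L, 1L, 1L)
)

# Hypothetical helper: add 1 to `loop` on rows after the last TRUE in `is_loop`.
# Groups with no TRUE are returned unchanged, so max() never sees an empty vector.
bump_after_last <- function(loop, is_loop) {
  hits <- which(is_loop)
  if (length(hits) == 0L) return(loop)
  loop + (seq_along(loop) > max(hits))  # the logical is coerced to 0/1
}

# Apply the helper per PERNO with base R split/unsplit.
df$loop1 <- unsplit(
  Map(bump_after_last,
      split(df$loop, df$PERNO),
      split(df$TPURP == "Loop trip", df$PERNO)),
  df$PERNO
)

df$loop1
# 1 2 3 3 1 2 3 1 1
```

Inside a dplyr pipeline the same helper should work as mutate(loop1 = bump_after_last(loop, TPURP == "Loop trip")) after group_by(PERNO).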
Data
df <- structure(list(PERNO = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 4L),
TPURP = structure(c(2L, 2L, 1L, 5L, 6L, 2L, 4L, 3L, 6L), .Label = c("home",
"Loop trip", "Looptrip", "school", "shopping", "work"), class = "factor"),
loop = c(1L, 2L, 2L, 2L, 1L, 2L, 2L, 1L, 1L)), class = "data.frame",
row.names = c(NA, -9L))
Alternatively, we can use grepl/grep for partial matching instead of exact matching, as @Sotos mentioned. On the updated dataset, we can do
df %>%
group_by(PERNO) %>%
dplyr::mutate(loop1 = ifelse(any(grepl('Loop', TPURP)) &
row_number() > max(grep('Loop', TPURP)), loop + 1, loop))
# PERNO TPURP loop loop1
# <dbl> <fct> <dbl> <dbl>
#1 1 (8) Dropped off passenger 1 1
#2 1 (1) Working at home (for pay) 1 1
#3 1 (24) Loop trip 2 2
#4 1 (24) Loop trip 2 2
#5 1 (9) Picked up passenger 2 3
#6 1 (2) All other home activities 2 3
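To illustrate why partial matching is needed for the updated data, here is a small sketch comparing exact and partial matching on a hypothetical vector x that mixes labels from both datasets. fixed = TRUE is an optional choice here that treats the pattern as a literal string rather than a regular expression.

```r
# Hypothetical sample of TPURP labels taken from both datasets.
x <- c("Loop trip", "Looptrip", "(24) Loop trip", "home")

x == "Loop trip"                 # exact match: TRUE only for the first element
grepl("Loop", x, fixed = TRUE)   # partial match: TRUE for the first three elements
grep("Loop", x, fixed = TRUE)    # positions of the matching elements: 1 2 3
```

Note that the partial match also picks up "Looptrip", which the exact comparison in the first version treats as a different value.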