r - Change a value based on 3 columns
Problem description
My data frame has the following subset:
Initial Date Type Sub_type
AML 2018-01-02 DV MR
AML 2018-01-02 DV MR_abdo
DJ 2018-01-02 DV MR
DJ 2018-01-02 DV MR_abdo
MS 2018-01-02 V2 V2
MS 2018-01-02 DV UL
NK 2018-01-02 DV Pet_ct
NK 2018-01-02 DV CT_dr
NK 2018-01-03 DV CT_dr
NK 2018-01-03 DV Pet_ct
PV 2018-01-03 V2 V2
PV 2018-01-03 DV UL
MD 2018-01-04 V2 V2
MD 2018-01-04 DV MR
NQ 2018-01-04 AN_BV V1
NQ 2018-01-04 DV CT_dr
PS 2018-01-04 DV Møder
PS 2018-01-04 DV Ferie
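I am trying to change the value of Type when, for the same Initial and the same Date, that person has a Sub_type of V2.
Taking MS as an example: on 2018-01-02 this person has the Types V2 and DV, with the Sub_types V2 and UL respectively. Because the person has a Sub_type of V2 on that date, I want the Type value DV changed to V2 for the same date.
Desired output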
Initial Date Type Sub_type
AML 2018-01-02 DV MR
AML 2018-01-02 DV MR_abdo
DJ 2018-01-02 DV MR
DJ 2018-01-02 DV MR_abdo
MS 2018-01-02 V2 V2
MS 2018-01-02 V2 UL
NK 2018-01-02 DV Pet_ct
NK 2018-01-02 DV CT_dr
NK 2018-01-03 DV CT_dr
NK 2018-01-03 DV Pet_ct
PV 2018-01-03 V2 V2
PV 2018-01-03 V2 UL
MD 2018-01-04 V2 V2
MD 2018-01-04 V2 MR
NQ 2018-01-04 AN_BV V1
NQ 2018-01-04 DV CT_dr
PS 2018-01-04 DV Møder
PS 2018-01-04 DV Ferie
And the input:
structure(list(Initial= c("AML", "AML", "DJ", "DJ", "MS",
"MS", "NK", "NK", "NK", "NK", "PV", "PV", "MD", "MD", "NQ", "NQ",
"PS", "PS"), Date = c("2018-01-02", "2018-01-02", "2018-01-02",
"2018-01-02", "2018-01-02", "2018-01-02", "2018-01-02", "2018-01-02",
"2018-01-03", "2018-01-03", "2018-01-03", "2018-01-03", "2018-01-04",
"2018-01-04", "2018-01-04", "2018-01-04", "2018-01-04", "2018-01-04"
), Type= c("DV", "DV", "DV", "DV", "V2", "DV", "DV", "DV",
"DV", "DV", "V2", "DV", "V2", "DV", "AN_BV", "DV", "DV", "DV"
), Sub_type= c("MR", "MR_abdo", "MR", "MR_abdo", "V2",
"UL", "Pet_ct", "CT_dr", "CT_dr", "Pet_ct", "V2", "UL", "V2",
"MR", "V1", "CT_dr", "Møder", "Ferie")), row.names = c(470L,
585L, 1605L, 1796L, 6081L, 6230L, 6673L, 6710L, 6514L, 6586L,
7490L, 7658L, 5512L, 5657L, 6968L, 7142L, 7182L, 7296L), class = "data.frame")
Solution
For each group of Initial and Date, we check whether Type == Sub_type in any row and, if so, return the Type from the row where they match.
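library(dplyr)
df %>%
  group_by(Initial, Date) %>%
  # If any row in the group has Type == Sub_type, use the Type of the first
  # such row (which.max returns the index of the first TRUE) for the whole
  # group; otherwise keep Type unchanged.
  mutate(Type = if (any(Type == Sub_type)) Type[which.max(Type == Sub_type)]
                else Type)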
# Initial Date Type Sub_type
# <chr> <chr> <chr> <chr>
# 1 AML 2018-01-02 DV MR
# 2 AML 2018-01-02 DV MR_abdo
# 3 DJ 2018-01-02 DV MR
# 4 DJ 2018-01-02 DV MR_abdo
# 5 MS 2018-01-02 V2 V2
# 6 MS 2018-01-02 V2 UL
# 7 NK 2018-01-02 DV Pet_ct
# 8 NK 2018-01-02 DV CT_dr
# 9 NK 2018-01-03 DV CT_dr
#10 NK 2018-01-03 DV Pet_ct
#11 PV 2018-01-03 V2 V2
#12 PV 2018-01-03 V2 UL
#13 MD 2018-01-04 V2 V2
#14 MD 2018-01-04 V2 MR
#15 NQ 2018-01-04 AN_BV V1
#16 NQ 2018-01-04 DV CT_dr
#17 PS 2018-01-04 DV Møder
#18 PS 2018-01-04 DV Ferie
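The dplyr answer above matches on Type == Sub_type. For comparison, here is a minimal base R sketch that follows the question's wording more literally: set Type to "V2" for every row of an Initial/Date group that contains a Sub_type of "V2". It is an illustration rather than the answerer's method; the six-row `df` is a subset of the question's data rebuilt so the snippet runs on its own, and `has_v2` is a helper name introduced here.

```r
# Subset of the question's data (rows for MS, NK and NQ), rebuilt for a
# self-contained example.
df <- data.frame(
  Initial  = c("MS", "MS", "NK", "NK", "NQ", "NQ"),
  Date     = c("2018-01-02", "2018-01-02", "2018-01-02",
               "2018-01-02", "2018-01-04", "2018-01-04"),
  Type     = c("V2", "DV", "DV", "DV", "AN_BV", "DV"),
  Sub_type = c("V2", "UL", "Pet_ct", "CT_dr", "V1", "CT_dr"),
  stringsAsFactors = FALSE
)

# For each Initial/Date group, flag every row of the group when at least
# one row in it has Sub_type "V2". ave() applies FUN per group and returns
# a vector aligned with the original rows.
has_v2 <- ave(df$Sub_type == "V2", df$Initial, df$Date, FUN = any)

# Overwrite Type only in the flagged groups.
df$Type[has_v2] <- "V2"

df
```

On this subset only the MS rows change. On the full data set, both approaches should agree as long as the only rows where Type equals Sub_type are the V2 rows, which holds for the sample shown in the question.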
Data
df <- structure(list(Initial = c("AML", "AML", "DJ", "DJ", "MS", "MS",
"NK", "NK", "NK", "NK", "PV", "PV", "MD", "MD", "NQ", "NQ", "PS",
"PS"), Date = c("2018-01-02", "2018-01-02", "2018-01-02", "2018-01-02",
"2018-01-02", "2018-01-02", "2018-01-02", "2018-01-02", "2018-01-03",
"2018-01-03", "2018-01-03", "2018-01-03", "2018-01-04", "2018-01-04",
"2018-01-04", "2018-01-04", "2018-01-04", "2018-01-04"), Type = c("DV",
"DV", "DV", "DV", "V2", "DV", "DV", "DV", "DV", "DV", "V2", "DV",
"V2", "DV", "AN_BV", "DV", "DV", "DV"), Sub_type = c("MR", "MR_abdo",
"MR", "MR_abdo", "V2", "UL", "Pet_ct", "CT_dr", "CT_dr", "Pet_ct",
"V2", "UL", "V2", "MR", "V1", "CT_dr", "Møder", "Ferie")), class =
"data.frame", row.names = c(NA, -18L))