Change values based on 3 columns

Problem description

My data frame has the following subset:

Initial  Date        Type   Sub_type
AML      2018-01-02  DV     MR
AML      2018-01-02  DV     MR_abdo
DJ       2018-01-02  DV     MR
DJ       2018-01-02  DV     MR_abdo
MS       2018-01-02  V2     V2
MS       2018-01-02  DV     UL
NK       2018-01-02  DV     Pet_ct
NK       2018-01-02  DV     CT_dr
NK       2018-01-03  DV     CT_dr
NK       2018-01-03  DV     Pet_ct
PV       2018-01-03  V2     V2
PV       2018-01-03  DV     UL
MD       2018-01-04  V2     V2
MD       2018-01-04  DV     MR
NQ       2018-01-04  AN_BV  V1
NQ       2018-01-04  DV     CT_dr
PS       2018-01-04  DV     Møder
PS       2018-01-04  DV     Ferie

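I am trying to change the value of Type for rows that share the same Initial and Date, whenever that person has a Sub_type of V2 on that date.

Take MS as an example: on 2018-01-02 this person has Type V2 and DV, with Sub_type V2 and UL respectively. Because the person has a Sub_type of V2 on that date, I want the Type value DV on the same date to be changed to V2.

Desired output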

Initial  Date        Type   Sub_type
AML      2018-01-02  DV     MR
AML      2018-01-02  DV     MR_abdo
DJ       2018-01-02  DV     MR
DJ       2018-01-02  DV     MR_abdo
MS       2018-01-02  V2     V2
MS       2018-01-02  V2     UL
NK       2018-01-02  DV     Pet_ct
NK       2018-01-02  DV     CT_dr
NK       2018-01-03  DV     CT_dr
NK       2018-01-03  DV     Pet_ct
PV       2018-01-03  V2     V2
PV       2018-01-03  V2     UL
MD       2018-01-04  V2     V2
MD       2018-01-04  V2     MR
NQ       2018-01-04  AN_BV  V1
NQ       2018-01-04  DV     CT_dr
PS       2018-01-04  DV     Møder
PS       2018-01-04  DV     Ferie

And the input, as dput() output:

structure(list(Initial= c("AML", "AML", "DJ", "DJ", "MS", 
"MS", "NK", "NK", "NK", "NK", "PV", "PV", "MD", "MD", "NQ", "NQ", 
"PS", "PS"), Date = c("2018-01-02", "2018-01-02", "2018-01-02", 
"2018-01-02", "2018-01-02", "2018-01-02", "2018-01-02", "2018-01-02", 
"2018-01-03", "2018-01-03", "2018-01-03", "2018-01-03", "2018-01-04", 
"2018-01-04", "2018-01-04", "2018-01-04", "2018-01-04", "2018-01-04"
), Type= c("DV", "DV", "DV", "DV", "V2", "DV", "DV", "DV", 
"DV", "DV", "V2", "DV", "V2", "DV", "AN_BV", "DV", "DV", "DV"
), Sub_type= c("MR", "MR_abdo", "MR", "MR_abdo", "V2", 
"UL", "Pet_ct", "CT_dr", "CT_dr", "Pet_ct", "V2", "UL", "V2", 
"MR", "V1", "CT_dr", "Møder", "Ferie")), row.names = c(470L, 
585L, 1605L, 1796L, 6081L, 6230L, 6673L, 6710L, 6514L, 6586L, 
7490L, 7658L, 5512L, 5657L, 6968L, 7142L, 7182L, 7296L), class = "data.frame")

Tags: r, dataframe

Solution


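For each group of Initial and Date, we check whether any row has Type == Sub_type. If one does, the Type from the first such row is assigned to every row in the group; otherwise Type is left unchanged.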

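library(dplyr)

df %>%
  # one group per person and day
  group_by(Initial, Date) %>%
  # if any row in the group has Type == Sub_type, use the Type of the first
  # such row for the whole group; otherwise keep Type as it is
  mutate(Type = if (any(Type == Sub_type)) Type[which.max(Type == Sub_type)]
                else Type)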

#   Initial Date       Type  Sub_type
#   <chr>   <chr>      <chr> <chr>   
# 1 AML     2018-01-02 DV    MR      
# 2 AML     2018-01-02 DV    MR_abdo 
# 3 DJ      2018-01-02 DV    MR      
# 4 DJ      2018-01-02 DV    MR_abdo 
# 5 MS      2018-01-02 V2    V2      
# 6 MS      2018-01-02 V2    UL      
# 7 NK      2018-01-02 DV    Pet_ct  
# 8 NK      2018-01-02 DV    CT_dr   
# 9 NK      2018-01-03 DV    CT_dr   
#10 NK      2018-01-03 DV    Pet_ct  
#11 PV      2018-01-03 V2    V2      
#12 PV      2018-01-03 V2    UL      
#13 MD      2018-01-04 V2    V2      
#14 MD      2018-01-04 V2    MR      
#15 NQ      2018-01-04 AN_BV V1      
#16 NQ      2018-01-04 DV    CT_dr   
#17 PS      2018-01-04 DV    Møder   
#18 PS      2018-01-04 DV    Ferie   
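
The answer above uses Type == Sub_type as the trigger, which matches the V2 rows in this data. As a minimal sketch of a variant that tests the condition exactly as the question words it (a Sub_type of "V2" within the same Initial and Date), something like the following could be used. The four-row df here is a cut-down illustration built from the MS and NQ rows, and the name res is introduced only for this example; neither is part of the original answer.

```r
library(dplyr)

# Cut-down illustration using the MS and NQ rows from the question
df <- data.frame(
  Initial  = c("MS", "MS", "NQ", "NQ"),
  Date     = c("2018-01-02", "2018-01-02", "2018-01-04", "2018-01-04"),
  Type     = c("V2", "DV", "AN_BV", "DV"),
  Sub_type = c("V2", "UL", "V1", "CT_dr"),
  stringsAsFactors = FALSE
)

res <- df %>%
  group_by(Initial, Date) %>%
  # if the person has a Sub_type of "V2" on this date, set every Type in
  # the group to "V2"; otherwise keep the existing Type values
  mutate(Type = if (any(Sub_type == "V2")) "V2" else Type) %>%
  ungroup()

res$Type
# "V2" "V2" "AN_BV" "DV"
```

On the full data from the question this should give the same result as the answer above. The two would differ only if a group contained a row where Type equals Sub_type with a value other than V2, or a Sub_type of V2 whose Type is something else.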

Data

df <- structure(list(Initial = c("AML", "AML", "DJ", "DJ", "MS", "MS", 
"NK", "NK", "NK", "NK", "PV", "PV", "MD", "MD", "NQ", "NQ", "PS", 
"PS"), Date = c("2018-01-02", "2018-01-02", "2018-01-02", "2018-01-02", 
"2018-01-02", "2018-01-02", "2018-01-02", "2018-01-02", "2018-01-03", 
"2018-01-03", "2018-01-03", "2018-01-03", "2018-01-04", "2018-01-04", 
"2018-01-04", "2018-01-04", "2018-01-04", "2018-01-04"), Type = c("DV", 
"DV", "DV", "DV", "V2", "DV", "DV", "DV", "DV", "DV", "V2", "DV", 
"V2", "DV", "AN_BV", "DV", "DV", "DV"), Sub_type = c("MR", "MR_abdo", 
"MR", "MR_abdo", "V2", "UL", "Pet_ct", "CT_dr", "CT_dr", "Pet_ct", 
"V2", "UL", "V2", "MR", "V1", "CT_dr", "Møder", "Ferie")), class = 
"data.frame", row.names = c(NA, -18L))
