R function to modify multiple columns

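Problem description

I am working with drug registry data. I want to calculate the dose of oral drugs prescribed before diagnosis. In the sample data, dia_date is the diagnosis date and ddd is the dose.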

library(tibble)  # provides tribble()

df1 <- tribble(
  ~id,  ~drug_group,    ~drug_type, ~prescribed_date,   ~dia_date,  ~ddd,
  1,    "A",    "oral",     2010,   2020,   1,  
  1,    "B",    "non-oral", 2011,   2020,   2,  
  2,    "A",    "oral",     2019,   2020,   1,  
  2,    "B",    "oral",     2019,   2020,   1,  
  2,    "C",    "oral",     2008,   2021,   2,  
  3,    "A",    "oral",     2021,   2020,   2,  
  3,    "C",    "non-oral", 2009,   2021,   2,  
  4,    "A",    "oral",     2010,   2020,   NA )

The output should look like this:


df2 <- tribble(
~id,    ~drug_group,    ~drug_type, ~prescribed_date,   ~dia_date,  ~ddd,   ~ddd_a, ~ddd_b, ~ddd_c,
1,  "A",    "oral",     2010,   2020,   1,  1,  0,  0,
1,  "B",    "non-oral", 2011,   2020,   2,  0,  0,  0,
2,  "A",    "oral",     2019,   2020,   1,  1,  0,  0,
2,  "B",    "oral",     2019,   2020,   1,  0,  1,  0,
2,  "C",    "oral",     2008,   2021,   2,  0,  0,  2,
3,  "A",    "oral",     2021,   2020,   2,  0,  0,  0,
3,  "C",    "non-oral", 2009,   2021,   2,  0,  0,  0,
4,  "A",    "oral",     2010,   2020,   NA, 0,  0,  0 )

The real dataset has more than 20 drug groups. I tried the following code, but neither attempt worked.


##Attempt1
 for (col in c("a","b","c")){
  ddd_= paste0("ddd_",col)
  df1[,ddd_] = df1$ddd
}

for (i in c("ddd_a","ddd_b","ddd_c")){
  if (df1$prescribed_date>df1$dia_date & df1$drug_group!="oral"){
    df1[,i] <- 0
  }
}

##Attempt2
for (col in c("a","b","c")){
  ddd_= paste0("ddd_",col)
  df1[,ddd_] = df1$ddd
}
f <- function (x) ifelse(df1$prescribed_date>df1$dia_date & df1$drug_group!="oral",0,x)
df1 %>% mutate(across(starts_with("ddd_")), f)

Any help would be greatly appreciated.

Tags: r, for-loop, dplyr

Solution


Using dcast from reshape2:

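library(dplyr)
library(reshape2)
df1 %>%
  # add one indicator column per drug group (A, B, C); `length` counts the rows in each cell
  dcast(id+drug_group+drug_type+prescribed_date+dia_date+ddd ~ drug_group, length) %>%
  # keep the dose only for oral drugs prescribed on or before the diagnosis date
  mutate_at(.funs = list(ddd = ~.*ddd*(drug_type == "oral")*(prescribed_date <= dia_date)), .vars = vars(A:C)) %>%
  select(-c(A:C))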

This produces:

  id drug_group drug_type prescribed_date dia_date ddd A_ddd B_ddd C_ddd
1  1          A      oral            2010     2020   1     1     0     0
2  1          B  non-oral            2011     2020   2     0     0     0
3  2          A      oral            2019     2020   1     1     0     0
4  2          B      oral            2019     2020   1     0     1     0
5  2          C      oral            2008     2021   2     0     0     2
6  3          A      oral            2021     2020   2     0     0     0
7  3          C  non-oral            2009     2021   2     0     0     0
8  4          A      oral            2010     2020  NA    NA    NA    NA
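
Note that this output differs from the requested df2 in two ways: the new columns are named A_ddd, B_ddd, C_ddd rather than ddd_a, ddd_b, ddd_c, and id 4 gets NA instead of 0 because its ddd is missing. If the exact layout of df2 is needed, something along the following lines might work. It is only a sketch, and it assumes that a missing dose should count as 0 and that "before diagnosis" includes the diagnosis year itself (the same `<=` comparison as above). The names `groups`, `dose` and `out` are made up for illustration.

```r
library(dplyr)
library(tibble)

# Sample data from the question
df1 <- tribble(
  ~id, ~drug_group, ~drug_type, ~prescribed_date, ~dia_date, ~ddd,
  1, "A", "oral",     2010, 2020, 1,
  1, "B", "non-oral", 2011, 2020, 2,
  2, "A", "oral",     2019, 2020, 1,
  2, "B", "oral",     2019, 2020, 1,
  2, "C", "oral",     2008, 2021, 2,
  3, "A", "oral",     2021, 2020, 2,
  3, "C", "non-oral", 2009, 2021, 2,
  4, "A", "oral",     2010, 2020, NA
)

# Collect the drug groups from the data, so 20+ groups need no hard-coding
groups <- sort(unique(df1$drug_group))

# Helper column: the dose that counts for this row.
# Assumption: a missing ddd is treated as 0.
out <- df1 %>%
  mutate(dose = if_else(drug_type == "oral" & prescribed_date <= dia_date,
                        coalesce(ddd, 0), 0))

# One column per drug group: the dose where the row belongs to that group, else 0
for (g in groups) {
  out[[paste0("ddd_", tolower(g))]] <- ifelse(out$drug_group == g, out$dose, 0)
}

# Drop the helper column
out <- select(out, -dose)
```

Because the column names are built from the values in drug_group, the loop scales to any number of groups without listing them by hand.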
