R: data.table, using a for loop to process multiple columns

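Problem description

I am building a for loop in R that adds the year to 7 columns containing partial dates (dd/mm). I have been trying to run the for loop below, without success. What am I doing wrong?

Here is a sample of my dataset (the real dataset has columns HomDate through HomDate_7, but I only include the first few, since you will get the idea):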

    Participant  DateVisit  HomDate  HomDate_2  HomeDate_3  year_flag
    1            2012-04-25 18/04    19/04      20/04       NA
    2            2012-01-04 28/12    29/12      30/12       1
    3            2012-01-05 31/12    01/01      01/02       1
    4            2012-06-13 06/06    07/06      08/06       NA
    5            2012-02-12 05/02    06/02      07/02       NA

Here is the code I have been trying to use:

    hom_date <- list("HomDate", "HomDate_2", "HomDate_3", "HomDate_4", "HomDate_5", "HomDate_6",
                     "HomDate_7")
    set_dates <- function(x){
      home_morbid[,x:=as.character(x)]
      home_morbid[(substr(x, 4, 5)==12) & (year_flag==1), x:=paste(x, "/2011", sep="")]
      home_morbid[(substr(x, 4, 5)==01) & (year_flag==1), x:=paste(x, "/2012", sep="")]
      home_morbid[is.na(year_flag), x:=paste(x, "/", substr(DateVisit, 1, 4), sep="")]
    }

    for(i in 1:length(hom_date)){
      x <- hom_date[i]
      home_morbid_2<-set_dates(x)
    }

Tags: r, data.table

Solution

I'm not sure what should happen to the rows whose year_flag is NA. Here is one approach:

    # df is assumed to be a data.table holding the sample data above
    # Positions of every column whose name starts with "Hom"
    to_replace <- grep("^Hom", names(df))
    df[, (to_replace) := lapply(.SD, function(x)
           ifelse(is.na(year_flag), x,              # no flag: leave dd/mm as it is
                  ifelse(substr(x, 4, 5) == "12",   # compare as strings, not numbers
                         paste0(x, "/", "2011"),
                         paste0(x, "/", "2012")))),
       .SDcols = to_replace][]
       Participant  DateVisit    HomDate  HomDate_2 HomeDate_3 year_flag
    1:           1 2012-04-25      18/04      19/04      20/04        NA
    2:           2 2012-01-04 28/12/2011 29/12/2011 30/12/2011         1
    3:           3 2012-01-05 31/12/2011 01/01/2012 01/02/2012         1
    4:           4 2012-06-13      06/06      07/06      08/06        NA
    5:           5 2012-02-12      05/02      06/02      07/02        NA

To fill in the year from DateVisit for the rows where year_flag is NA:

    library(lubridate)
    # Positions of every column whose name starts with "Hom"
    to_replace <- grep("^Hom", names(df))
    df[, (to_replace) := lapply(.SD, function(x)
           ifelse(is.na(year_flag),
                  paste0(x, "/", year(ymd(DateVisit))),   # no flag: use the visit year
                  ifelse(substr(x, 4, 5) == "12",         # compare as strings, not numbers
                         paste0(x, "/", "2011"),
                         paste0(x, "/", "2012")))),
       .SDcols = to_replace][]
       Participant  DateVisit    HomDate  HomDate_2 HomeDate_3 year_flag
    1:           1 2012-04-25 18/04/2012 19/04/2012 20/04/2012        NA
    2:           2 2012-01-04 28/12/2011 29/12/2011 30/12/2011         1
    3:           3 2012-01-05 31/12/2011 01/01/2012 01/02/2012         1
    4:           4 2012-06-13 06/06/2012 07/06/2012 08/06/2012        NA
    5:           5 2012-02-12 05/02/2012 06/02/2012 07/02/2012        NA
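
If you would rather keep an explicit for loop, as in the question, something along these lines might work. It is only a sketch under a few assumptions: the table is rebuilt here from the sample rows in the question (with DateVisit stored as text), `hom_cols`, `mm` and `yr` are names made up for illustration, and the year rule is the same as in the lapply version above. As far as I can tell, the original loop fails for two reasons. Inside `[.data.table`, `x := ...` assigns to a column literally named `x` rather than to the column whose name is stored in `x`. And `substr(x, 4, 5) == 01` compares a string such as "01" with the number 1, which R coerces to "1", so the test is never TRUE.

```r
library(data.table)

# Sample rows copied from the question (assumption: DateVisit is character).
home_morbid <- data.table(
  Participant = 1:5,
  DateVisit   = c("2012-04-25", "2012-01-04", "2012-01-05", "2012-06-13", "2012-02-12"),
  HomDate     = c("18/04", "28/12", "31/12", "06/06", "05/02"),
  HomDate_2   = c("19/04", "29/12", "01/01", "07/06", "06/02"),
  HomeDate_3  = c("20/04", "30/12", "01/02", "08/06", "07/02"),
  year_flag   = c(NA, 1, 1, NA, NA)
)

# Column names as a character vector (not a list), matched by prefix.
hom_cols <- grep("^Hom", names(home_morbid), value = TRUE)

for (col in hom_cols) {
  # Month part of the original dd/mm strings in this column.
  mm <- substr(home_morbid[[col]], 4, 5)
  # No flag: take the year of the visit. Flagged: December is 2011, otherwise 2012.
  yr <- ifelse(is.na(home_morbid$year_flag),
               substr(home_morbid$DateVisit, 1, 4),
               ifelse(mm == "12", "2011", "2012"))
  # set() updates, by reference, the column whose name is held in `col`.
  set(home_morbid, j = col, value = paste0(home_morbid[[col]], "/", yr))
}

home_morbid[]
```

Because `set()` modifies the table in place, there is no need to capture a return value the way `home_morbid_2 <- set_dates(x)` tried to.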
