R data.table: how to generalize multiple joins in a loop?

Question

After getting a lot of help here, I managed to get the following joins to run. Step by step, I am updating my main DT:

DT1 <- data.table(crit = rep(c('AA', 'BB', 'CC', 'DD'),each = 3),
                  num = rep(1:3, 4), 
                  val = rnorm(12)^2)
DT1

DT2 <- data.table(BB = c(1,3),
                  cross = c(128, 183))
DT2

DT3 <- data.table(DD = c(2,3),
                  cross = c(99, 787))
DT3

DT1[DT2[,  c(.(crit = 'BB'), .SD)] , cross := ifelse(is.na(cross), i.cross, cross), on = .(crit, num = BB)]
DT1[DT3[,  c(.(crit = 'DD'), .SD)] , cross := ifelse(is.na(cross), i.cross, cross), on = .(crit, num = DD)]

However, I need to do this in a loop, probably with mapply. Something like:

mapply(fun.join, DTmain = DT1, DTsec = DT2, MoreArgs = list('BB'))
mapply(fun.join, DTmain = DT1, DTsec = DT3, MoreArgs = list('DD'))

But I can't seem to write a correct fun.join function.

Thanks for your help!

Tags: r, join, data.table, apply, mapply

Answer


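You can try the code below. It reshapes each secondary table so that the name of its first column becomes a key value (ind) and the column's contents become values, then joins the result to DT1: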

lapply(
  list(DT2, DT3),
  function(dt) {
    dt[
      ,
      c(stack(.SD[, 1]), .(cross = cross))
    ][
      DT1,
      on = .(ind = crit, values = num)
    ]
  }
)

This gives:

[[1]]
    values ind cross       val
 1:      1  AA    NA 0.1287103
 2:      2  AA    NA 2.0288966
 3:      3  AA    NA 0.8914414
 4:      1  BB   128 0.6451096
 5:      2  BB    NA 0.8424112
 6:      3  BB   183 0.3420138
 7:      1  CC    NA 0.4047142
 8:      2  CC    NA 0.7423724
 9:      3  CC    NA 1.3762432
10:      1  DD    NA 0.1086974
11:      2  DD    NA 6.0831923
12:      3  DD    NA 0.5619010

[[2]]
    values ind cross       val
 1:      1  AA    NA 0.1287103
 2:      2  AA    NA 2.0288966
 3:      3  AA    NA 0.8914414
 4:      1  BB    NA 0.6451096
 5:      2  BB    NA 0.8424112
 6:      3  BB    NA 0.3420138
 7:      1  CC    NA 0.4047142
 8:      2  CC    NA 0.7423724
 9:      3  CC    NA 1.3762432
10:      1  DD    NA 0.1086974
11:      2  DD    99 6.0831923
12:      3  DD   787 0.5619010
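
Note that the lapply call above returns one joined table per secondary table and leaves DT1 itself unchanged. If the goal is to keep updating DT1 in place, as in the question, a loop over the secondary tables may be closer to what is wanted. The following is only a sketch under some assumptions: fun.join is a hypothetical helper written for illustration, it assumes the first column name of each secondary table is the crit value to match, and set.seed is added only to make the example reproducible. A plain for loop is used instead of mapply because mapply would iterate over the columns of DT1 and DT2 (a data.table is a list of columns) rather than pass the whole tables.

```r
library(data.table)

set.seed(1)  # assumption: fixed seed, only for reproducibility
DT1 <- data.table(crit = rep(c("AA", "BB", "CC", "DD"), each = 3),
                  num  = rep(1:3, 4),
                  val  = rnorm(12)^2)
DT2 <- data.table(BB = c(1, 3), cross = c(128, 183))
DT3 <- data.table(DD = c(2, 3), cross = c(99, 787))

# Hypothetical helper: updates DTmain by reference.
# Assumes the name of DTsec's first column is the crit value to match.
fun.join <- function(DTmain, DTsec) {
  key <- names(DTsec)[1]
  # Rebuild DTsec with explicit join columns: crit (constant) and num
  lookup <- data.table(crit = key, num = DTsec[[key]], cross = DTsec$cross)
  # Update join: fill cross only where it is still NA
  DTmain[lookup, cross := ifelse(is.na(cross), i.cross, cross),
         on = .(crit, num)]
  invisible(DTmain)
}

# Create the target column first so every pass of the loop behaves the same
DT1[, cross := NA_real_]

for (DTsec in list(DT2, DT3)) fun.join(DT1, DTsec)

DT1
```

After the loop, cross should be filled for the BB rows with num 1 and 3 and for the DD rows with num 2 and 3, and remain NA everywhere else.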
