lapply - difference between passing and not passing an argument when a named parameter exists in the environment

Problem description

See the edit at the end for a reproducible example.

When I run boot::censboot(data, statistic, parallel = "multicore", ncpus = 2, var = whatEver), where I have defined statistic <- function(data, var), I get an error message of the form FUN(X[[i]], ...) : unused argument (var = whatEver). The problem is that statistic cannot see var.

This does not happen when I call boot::censboot(data, statistic, parallel = "no").

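By debugging I can see:

Questions:

  1. Why is this the case? In the ordinary case, how can stat see s = 1 when lapply does not actually pass ... along? Why does this no longer hold in the parallel case, where lapply is called with ...?
  2. I cannot change the internals of main, which stands in for boot::censboot. How can I change stat so that it works in both cases?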

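Edit: added a reproducible example

As requested by a commenter below, here is an example that reproduces the error in the parallel case. If you set parallel = "no", ncpus = 1 in the boot::censboot call, it works as expected.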

library(boot)
library(survival)
data(aml, package = "boot") 

statMeanSurv <- function(data, var) {
  surv <- survfit(Surv(time, cens) ~ 1, data = data)
  mean(surv$surv) + var
}

res <- censboot(aml, statMeanSurv, R = 5,
                var = 1, parallel = "multicore", ncpus = 2)

res$t

Output:

> res <- censboot(aml, statMeanSurv, R = 5,
+                 var = 1, parallel = "multicore", ncpus = 2)
Warning message:
In parallel::mclapply(seq_len(R), fn, ..., mc.cores = ncpus) :
  all scheduled cores encountered errors in user code
> 
> res$t
     [,1]                                                     
[1,] "Error in FUN(X[[i]], ...) : unused argument (var = 1)\n"
[2,] "Error in FUN(X[[i]], ...) : unused argument (var = 1)\n"
[3,] "Error in FUN(X[[i]], ...) : unused argument (var = 1)\n"
[4,] "Error in FUN(X[[i]], ...) : unused argument (var = 1)\n"
[5,] "Error in FUN(X[[i]], ...) : unused argument (var = 1)\n"

Tags: r

Solution


This is a rewrite of the original post that gives a better explanation of what went wrong and fixes a possible bug in the workaround.

That looks like a bug in censboot. It doesn't handle the ... parameter correctly. (More explanation below.) The reason you don't get an error with parallel = 'no' is that the code follows a different path.

A workaround is to use "partial application" to create a 1-parameter statistic function, like this:

library(boot)
library(survival)
#> 
#> Attaching package: 'survival'
#> The following object is masked from 'package:boot':
#> 
#>     aml
data(aml, package = "boot") 

statMeanSurv <- function(data, var) {
  surv <- survfit(Surv(time, cens) ~ 1, data = data)
  mean(surv$surv) + var
}

statMeanSurv1 <- function(var) { 
  force(var)   # Fix the value of var
  function(data) statMeanSurv(data, var)  # one-argument statistic; var is fixed
}

res <- censboot(aml, statMeanSurv1(var = 1), R = 5,
                parallel = "multicore", ncpus = 2)

res$t
#>          [,1]
#> [1,] 1.564580
#> [2,] 1.503473
#> [3,] 1.602111
#> [4,] 1.440942
#> [5,] 1.594482

Created on 2021-02-04 by the reprex package (v0.3.0)

Internally, the problem in censboot is that it does something like my workaround, but then it also passes ... to its equivalent of the function returned by statMeanSurv1(), and that's an error: that function can only accept 1 argument.
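The pattern described above can be sketched in base R without boot or parallel. This is only an illustration: stat and main are hypothetical stand-ins borrowed from the question's naming, forward_dots is a made-up switch, and plain lapply is used in place of parallel::mclapply. It is not the actual censboot source.

```r
# Hypothetical stand-ins, not the actual censboot source.
stat <- function(data, var) mean(data) + var

main <- function(data, statistic, R, ..., forward_dots = FALSE) {
  # fn is created inside main, so the `...` in its body refers to main's `...`.
  # fn itself takes exactly one argument: the replicate index.
  fn <- function(i) statistic(data, ...)
  if (forward_dots) {
    # Forwarding ... a second time hands `var` to fn, which has no such argument.
    lapply(seq_len(R), fn, ...)
  } else {
    # Here the extra arguments reach statistic only through fn's body.
    lapply(seq_len(R), fn)
  }
}

# Works: var reaches stat through the closure.
ok <- main(c(1, 2, 3), stat, R = 2, var = 1)

# Fails with an "unused argument" error, captured here as a string.
bad <- tryCatch(
  main(c(1, 2, 3), stat, R = 2, var = 1, forward_dots = TRUE),
  error = function(e) conditionMessage(e)
)
```

In this sketch the first call corresponds to the parallel = "no" path and the second to the path that forwards ... again, as in the mclapply(seq_len(R), fn, ..., mc.cores = ncpus) call shown in the warning message.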

The line force(var) in statMeanSurv1 isn't necessary in the example, but in more elaborate examples it might be. It guarantees that the newly created function uses the specified value.
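As a small illustration of when force() matters, the sketch below uses two hypothetical function factories, make_adder_lazy and make_adder_forced, which are not part of the original example. Because R evaluates arguments lazily, the factory without force() only looks up its argument the first time the returned function is called.

```r
# Hypothetical factories for illustration only.
make_adder_lazy <- function(x) {
  function(y) x + y            # x stays an unevaluated promise for now
}

make_adder_forced <- function(x) {
  force(x)                     # evaluate x immediately, fixing its value
  function(y) x + y
}

v <- 1
lazy   <- make_adder_lazy(v)
forced <- make_adder_forced(v)

v <- 100                       # change v before either function is called

lazy(0)    # 100: the promise for x is evaluated only now, after v changed
forced(0)  # 1: x was fixed when the function was created
```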

