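r - lapply - Difference between passing and not passing arguments when named arguments exist in the environment
Problem description
See the edit at the end for a reproducible example.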
When I run boot::censboot(data, statistic, parallel = "multicore", ncpus = 2, var = whatEver), where I have defined statistic <- function(data, var), I get an error message of the form FUN(X[[i]], ...) : unused argument (var = whatEver). The problem is that statistic cannot see var.
This does not happen when I call boot::censboot(data, statistic, parallel = "no").
By debugging I can see the following.
If parallel = "no", boot::censboot is running something like this:
stat <- function(r, s) {r + s}
main <- function(...) {
  fn <- {function(r) stat(r, ...)}
  lapply(1:2, fn)
}
main(s = 2)
Output:
[[1]]
[1] 3

[[2]]
[1] 4
In this case stat is indeed able to see s = 2, even though fn is a function of r only (rather than of r AND ...).
But if parallel = "multicore", ncpus = 2, then boot::censboot runs something like this (note that the only difference from the code block above is the ... in the lapply call):
stat <- function(r, s) {r + s}
main <- function(...) {
  fn <- {function(r) stat(r, ...)}
  lapply(1:2, fn, ...)
}
main(s = 2)
Output:
Error in FUN(X[[i]], ...) : unused argument (s = 2)
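In this case stat cannot see s = 2. This is the root cause of the unused-argument error message. (Of course, boot::censboot actually calls parallel::mclapply rather than lapply to parallelize, but the problem has to do with the ... that is passed to lapply / parallel::mclapply inside boot::censboot.)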
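Questions:
- Why is this happening? In the plain case, how is stat able to see s = 2 when lapply is called without actually passing the ... argument? Why is this no longer true in the parallel case, where lapply is called with ...?
- I cannot change the internals of main, which stands in for boot::censboot. How can I change stat so that it works in both cases?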
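Edit: added a reproducible example
As requested by a commenter below, here is an example that reproduces the error in the parallel case. If you set parallel = "no", ncpus = 1 in the boot::censboot call, the code works as expected.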
library(boot)
library(survival)
data(aml, package = "boot")
statMeanSurv <- function(data, var) {
  surv <- survfit(Surv(time, cens) ~ 1, data = data)
  mean(surv$surv) + var
}
res <- censboot(aml, statMeanSurv, R = 5,
                var = 1, parallel = "multicore", ncpus = 2)
res$t
Output:
> res <- censboot(aml, statMeanSurv, R = 5,
+ var = 1, parallel = "multicore", ncpus = 2)
Warning message:
In parallel::mclapply(seq_len(R), fn, ..., mc.cores = ncpus) :
all scheduled cores encountered errors in user code
>
> res$t
[,1]
[1,] "Error in FUN(X[[i]], ...) : unused argument (var = 1)\n"
[2,] "Error in FUN(X[[i]], ...) : unused argument (var = 1)\n"
[3,] "Error in FUN(X[[i]], ...) : unused argument (var = 1)\n"
[4,] "Error in FUN(X[[i]], ...) : unused argument (var = 1)\n"
[5,] "Error in FUN(X[[i]], ...) : unused argument (var = 1)\n"
Solution
This is a rewrite of the original post that gives a better explanation of what went wrong and fixes a possible bug in the workaround.
That looks like a bug in censboot. It doesn't handle the ... parameter correctly. (More explanation below.) The reason you don't get an error with parallel = 'no' is that the code follows a different path.
A workaround is to use "partial application" to create a 1-parameter statistic function, like this:
library(boot)
library(survival)
#>
#> Attaching package: 'survival'
#> The following object is masked from 'package:boot':
#>
#>     aml
data(aml, package = "boot")
statMeanSurv <- function(data, var) {
  surv <- survfit(Surv(time, cens) ~ 1, data = data)
  mean(surv$surv) + var
}
# Factory that returns a 1-parameter statistic with var already fixed
statMeanSurv1 <- function(var) {
  force(var) # Fix the value of var
  function(data) statMeanSurv(data, var)
}
res <- censboot(aml, statMeanSurv1(var = 1), R = 5,
                parallel = "multicore", ncpus = 2)
res$t
#>          [,1]
#> [1,] 1.564580
#> [2,] 1.503473
#> [3,] 1.602111
#> [4,] 1.440942
#> [5,] 1.594482
Created on 2021-02-04 by the reprex package (v0.3.0)
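If several statistic functions need the same treatment, the partial-application idea can be wrapped in a small helper. The following is only a sketch: fix_args, toy_stat and toy_stat1 are hypothetical names that are not part of boot, and the helper is exercised here on a toy data frame rather than through censboot itself.

```r
# Hypothetical helper: returns a 1-parameter function with the extra arguments fixed.
fix_args <- function(f, ...) {
  extra <- list(...)  # list(...) evaluates the extra arguments right away
  function(data) do.call(f, c(list(data), extra))
}

# Toy statistic with the same (data, var) shape as statMeanSurv
toy_stat <- function(data, var) mean(data$x) + var

# 1-parameter version with var fixed to 1
toy_stat1 <- fix_args(toy_stat, var = 1)
print(toy_stat1(data.frame(x = c(1, 2, 3))))  # mean is 2, plus var = 1
```

Under that assumption, fix_args(statMeanSurv, var = 1) could be passed where statMeanSurv1(var = 1) appears above, with no var argument given to censboot.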
Internally, the problem in censboot is that it does something like my workaround, but then it also passes ... to its equivalent of the function returned by statMeanSurv1, and that's an error: that function can only accept 1 argument.
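The mechanism can be reproduced without boot. This is a minimal sketch using the toy stat from the question; main_closure and main_forward are hypothetical names standing in for the two code paths, not censboot's actual source. A closure finds ... in the environment where it was created, so nothing has to be passed through lapply. When ... is also given to lapply, lapply forwards it to the 1-argument closure as extra arguments, which triggers the error.

```r
stat <- function(r, s) r + s

# Path A: `...` reaches stat only through the closure's enclosing environment.
main_closure <- function(...) {
  fn <- function(r) stat(r, ...)
  lapply(1:2, fn)
}

# Path B: `...` is additionally forwarded by lapply, so fn is called as fn(X[[i]], s = 2).
main_forward <- function(...) {
  fn <- function(r) stat(r, ...)
  lapply(1:2, fn, ...)
}

res_a <- main_closure(s = 2)
res_b <- tryCatch(main_forward(s = 2),
                  error = function(e) conditionMessage(e))

print(res_a)  # list of 3 and 4
print(res_b)  # the captured "unused argument" message
```

So in both paths stat can see s through the closure; in path B the call fails earlier, when fn itself is handed an argument it has no parameter for.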
The line force(var) in statMeanSurv1 isn't necessary in the example, but in more elaborate examples it might be. It guarantees that the newly created function uses the specified value.
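A small sketch of a case where force() does matter, using hypothetical helper names (make_lazy, make_forced). R evaluates function arguments lazily, so without force() the argument is only looked up the first time the returned function runs, and a change to the variable in between leaks in.

```r
# Without force(): var stays an unevaluated promise until the closure is first called.
make_lazy <- function(var) {
  function(x) x + var
}

# With force(): var is evaluated when the factory is called.
make_forced <- function(var) {
  force(var)
  function(x) x + var
}

v <- 1
f_lazy <- make_lazy(v)
f_forced <- make_forced(v)
v <- 100  # change the variable before either closure has been called

print(f_lazy(0))    # 100: the promise is evaluated now and sees the new v
print(f_forced(0))  # 1: the value was fixed when make_forced() ran
```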