mlr3 distrcompose cdf: subscript out of bounds

Problem description

I am using R 3.6.3, mlr3 0.4.0-9000, mlr3proba 0.1.6.9000, mlr3pipelines 0.1.2 and xgboost 0.90.0.2 (versions as reported by the RStudio package manager).

I have set up the following graph pipeline:

imputePipe = PipeOpImputeMean$new(id = "imputemean", param_vals = list())
survXGPipe = mlr_pipeops$get("learner",lrn("surv.xgboost"))

graphXG= Graph$new()$
  add_pipeop(imputePipe)$
  add_pipeop(po("learner", lrn("surv.kaplan")))$
  add_pipeop(survXGPipe)$
  add_pipeop(po("distrcompose"))$
  add_edge("imputemean","surv.kaplan")$
  add_edge("imputemean","surv.xgboost")$
  add_edge("surv.kaplan","distrcompose", dst_channel = "base")$
  add_edge("surv.xgboost","distrcompose", dst_channel = "pred")

Unfortunately, when I run the following commands:

lrnXG = GraphLearner$new(graphXG)
trainResults = lrnXG$train(trainVerTask, row_ids = trainDataInd)
predictionResults = lrnXG$predict(trainVerTask, row_ids = verDataInd)

the call to predict returns the following error:

Error in cdf[i, ] : subscript out of bounds

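The error appears to be specific to distrcompose: it does not occur when I build simple graphs that use only surv.xgboost or only surv.kaplan.

It also appears to be independent of the data: I tried changing the input data, and the same error is returned whenever distrcompose is used. If you would like any further information on this, please let me know. Thank you in advance for your time.

The error can be reproduced with the following code: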

library(mlr3)
library(mlr3pipelines)
library(mlr3proba)
library(mlr3learners)
task = tgen("simsurv")$generate(1000)
imputePipe = PipeOpImputeMean$new(id = "imputemean", param_vals = list())
survXGPipe = mlr_pipeops$get("learner",lrn("surv.xgboost"))

graphXG= Graph$new()$
  add_pipeop(imputePipe)$
  add_pipeop(po("learner", lrn("surv.kaplan")))$
  add_pipeop(survXGPipe)$
  add_pipeop(po("distrcompose"))$
  add_edge("imputemean","surv.kaplan")$
  add_edge("imputemean","surv.xgboost")$
  add_edge("surv.kaplan","distrcompose", dst_channel = "base")$
  add_edge("surv.xgboost","distrcompose", dst_channel = "pred")

lrnXG = GraphLearner$new(graphXG)
trainResults = lrnXG$train(task, row_ids = 1:900)
lrnXG$predict(task, row_ids = 901:1000)

Tags: r, xgboost, mlr3

Solution


The problem lies in distr6. Please install the latest versions of distr6 (1.4.2) and mlr3proba (0.2.0) from CRAN and try again.
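
Before re-running the pipeline, it may help to confirm which versions are actually installed. The following is a minimal sketch, not part of the original answer: `is_outdated` and `installed_version` are hypothetical helper names introduced here for illustration, and the minimum versions are simply the ones quoted in the answer. The install call is left commented out so that nothing is changed unless you choose to.

```r
# Minimum versions named in the answer (distr6 1.4.2, mlr3proba 0.2.0).
required <- c(distr6 = "1.4.2", mlr3proba = "0.2.0")

# Hypothetical helper: TRUE when `installed` is older than `required`.
is_outdated <- function(installed, required) {
  utils::compareVersion(installed, required) < 0
}

# Hypothetical helper: installed version as a string, or NA if the package is missing.
installed_version <- function(pkg) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    as.character(utils::packageVersion(pkg))
  } else {
    NA_character_
  }
}

for (pkg in names(required)) {
  have <- installed_version(pkg)
  if (is.na(have) || is_outdated(have, required[[pkg]])) {
    message(pkg, " needs updating (found: ", have, ", need >= ", required[[pkg]], ")")
    # install.packages(pkg)  # uncomment to update, as the answer suggests
  }
}
```

With the versions listed in the question, mlr3proba 0.1.6.9000 compares as older than 0.2.0, so it would be flagged. Restart the R session after updating so the new package versions are loaded before calling predict again.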

