How do I run caret correctly on a supercomputer?

Problem description

Supercomputer setup (session info)

R version 3.6.3 (2020-02-29)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Debian GNU/Linux 10 (buster)

Matrix products: default
BLAS/LAPACK: /opt/intel/compilers_and_libraries_2019.5.281/linux/mkl/lib/intel64_lin/libmkl_gf_lp64.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=C              LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] caret_6.0-86    lattice_0.20-38 forcats_0.5.0   stringr_1.4.0   dplyr_0.8.5     purrr_0.3.4     readr_1.3.1     tidyr_1.1.0    
 [9] tibble_3.0.1    ggplot2_3.3.0   tidyverse_1.3.0

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.4.6         lubridate_1.7.8      class_7.3-15         assertthat_0.2.1     ipred_0.9-9          foreach_1.5.0       
 [7] R6_2.4.1             cellranger_1.1.0     plyr_1.8.6           backports_1.1.7      stats4_3.6.3         reprex_0.3.0        
[13] httr_1.4.1           pillar_1.4.4         rlang_0.4.6          readxl_1.3.1         data.table_1.12.8    rstudioapi_0.11     
[19] rpart_4.1-15         Matrix_1.2-18        splines_3.6.3        gower_0.2.1          munsell_0.5.0        broom_0.5.6         
[25] compiler_3.6.3       modelr_0.1.8         pkgconfig_2.0.3      nnet_7.3-12          tidyselect_1.1.0     prodlim_2019.11.13  
[31] codetools_0.2-16     fansi_0.4.1          crayon_1.3.4         dbplyr_1.4.3         withr_2.2.0          ModelMetrics_1.2.2.2
[37] MASS_7.3-51.5        recipes_0.1.12       grid_3.6.3           nlme_3.1-144         jsonlite_1.6.1       gtable_0.3.0        
[43] lifecycle_0.2.0      DBI_1.1.0            magrittr_1.5         pROC_1.16.2          scales_1.1.1         cli_2.0.2           
[49] stringi_1.4.6        reshape2_1.4.4       fs_1.4.1             timeDate_3043.102    xml2_1.3.2           ellipsis_0.3.1      
[55] generics_0.0.2       vctrs_0.3.0          lava_1.6.7           iterators_1.0.12     tools_3.6.3          glue_1.4.1          
[61] hms_0.5.3            survival_3.1-8       colorspace_1.4-1     rvest_0.3.5          haven_2.2.0         

The instance has 0 GPUs, 64 CPUs, and 320 GB of memory.

Reproducible example

# packages
library(tidyverse)
library(caret)
library(parallel)

# number of observations, and number of variables per predictor type
n.obs <- 100000
n.vars <- 20

# generate data
class.data <- twoClassSim(
  n = n.obs,
  intercept = 0,
  linearVars = n.vars,
  noiseVars  = n.vars,
  corrVars = n.vars,
  ordinal = FALSE
)



# generate folds: 10-fold CV repeated 3 times
set.seed(1903)
myFolds <- createMultiFolds(
  y = class.data$Class,
  k = 10,
  times = 3
)

# candidate models
algorithm <- c(
  
  # Regular Tree
  "rpart",
  
  # Random Forest
  "rf",
  
  # Gradient Boosted Machine
  "gbm"
)


# trainControl object: resample on the precomputed folds
myControl <- trainControl(
  index = myFolds,
  method = "repeatedcv",
  allowParallel = TRUE,
  verboseIter = TRUE
)


# model formula
model.formula <- as.formula(
  Class ~ .
)



# test with a single model ####

# create the cluster
cl <- makeCluster(
  spec = 30
  # Was 10; 325
  # Was 20; 304
  # Was 30; 313
)


doParallel::registerDoParallel(
  cl = cl
)

system.time(
  baseline.model <- train(
    form = model.formula,
    data = class.data,
    method = algorithm[2],
    trControl = myControl,
    num.threads = 30
  )
)

stopCluster(
  cl = cl
)
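
A note on the call above: `num.threads` is an argument of the ranger package, while `method = "rf"` in caret wraps randomForest, whose documentation lists no such argument. As a minimal sketch, assuming the ranger package is installed and that a multithreaded forest is what was intended, the same kind of model can be requested with `method = "ranger"`. The data size, fold count, tree count, and thread count below are illustrative values chosen so the example finishes quickly, not recommendations.

```r
library(caret)

# Small simulated data set so the sketch runs in seconds (illustrative sizes)
set.seed(1903)
demo.data <- twoClassSim(n = 500, linearVars = 5, noiseVars = 5)

# Plain 3-fold CV; no foreach cluster, ranger does its own threading
demo.control <- trainControl(
  method = "cv",
  number = 3,
  allowParallel = FALSE
)

# num.trees and num.threads are passed through train() to ranger::ranger()
ranger.model <- train(
  Class ~ .,
  data = demo.data,
  method = "ranger",
  trControl = demo.control,
  tuneLength = 1,
  num.trees = 100,
  num.threads = 2
)

print(ranger.model$results)
```

If a foreach cluster is registered at the same time, the number of workers multiplied by `num.threads` may exceed the available cores, so it is probably simpler to use one form of parallelism at a time.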

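I never get any results

So far the algorithm has been running for 14 hours without producing any result. I tried reducing the feature set to only the relevant features, which leaves about 21 of them. After 6 hours of running, I gave up on that as well.

Am I doing something wrong, or is it normal for an algorithm on a machine like this to run for more than 12 hours before any results appear?

Do you have any suggestions on how to approach this kind of problem, so that I don't wake up after hours of running to find an error?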
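
For a sense of scale, the amount of work requested by the example can be estimated with a little arithmetic. The sketch below assumes caret's default `tuneLength` of 3 (three `mtry` candidates for `"rf"`) and randomForest's default of 500 trees per forest, neither of which is overridden in the code in the question. The variable names are illustrative.

```r
# Rough count of random forest fits implied by the setup in the question
n_folds   <- 10   # k in createMultiFolds()
n_repeats <- 3    # times in createMultiFolds()
n_mtry    <- 3    # assumption: caret's default tuneLength for "rf"
n_trees   <- 500  # assumption: randomForest's default ntree

resample_fits <- n_folds * n_repeats * n_mtry  # one fit per resample per mtry
total_fits    <- resample_fits + 1             # plus the final refit on all rows
total_trees   <- total_fits * n_trees

cat("model fits: ", total_fits, "\n")
cat("trees grown:", total_trees, "\n")
```

Each resampling fit trains on roughly 90,000 of the 100,000 rows, so the total cost is dominated by how long a single forest of that size takes to grow.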
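
One common way to avoid discovering an error after hours of running is to push a deliberately tiny version of the same pipeline through `train()` first and time it. The sketch below is an illustration under stated assumptions, not a tuned recipe: the subsample size, fold count, `tuneLength`, and `ntree` are arbitrary small values, and it assumes the randomForest package is installed.

```r
library(caret)

# Tiny simulated data set with the same structure as the full problem
set.seed(1903)
small.data <- twoClassSim(n = 500, linearVars = 5, noiseVars = 5)

# Cheap resampling scheme, run sequentially so errors surface directly
small.control <- trainControl(
  method = "cv",
  number = 3,
  allowParallel = FALSE
)

# Time a reduced fit; ntree is passed through train() to randomForest()
timing <- system.time(
  small.model <- train(
    Class ~ .,
    data = small.data,
    method = "rf",
    trControl = small.control,
    tuneLength = 2,
    ntree = 50
  )
)

print(timing[["elapsed"]])
print(small.model$results)
```

If this completes cleanly, the row count, tree count, and number of resamples can be scaled up in steps, using the elapsed times to extrapolate before committing to the full run.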

Tags: r, parallel-processing, r-caret
