首页 > 解决方案 > 如何使用 R 中的复制函数设置种子以存档模拟结果的重现性

问题描述

我编写了一个R函数,并且kth希望多次运行它,并且在相同的环境中运行相同的任何时候都希望得到相同的结果。我想设置种子,但无法获得相同的结果,因为我连续两次使用相同的种子运行相同的功能。

## Load packages and prepare multicore process
library(forecast)
library(future.apply)
plan(multisession)
library(parallel)
library(foreach)
library(doParallel)
n_cores <- detectCores()
cl <- makeCluster(n_cores)
registerDoParallel(cores = n_cores)
bootstrap1 <- function(n, phi){
  ts <- arima.sim(n, model = list(ar=phi, order = c(1, 1, 0)), sd = 1)
  #ts <- numeric(n)
  #ts[1] <- rnorm(1)
  #for(i in 2:length(ts))
  #  ts[i] <- 2 * ts[i - 1] + rnorm(1)
  ########################################################
  ## create a vector of block sizes
  t <- length(ts)    # the length of the time series
  lb <- seq(n-2)+1   # vector of block sizes to be 1 < l < n (i.e to be between 1 and n exclusively)
  ########################################################
  ## This section create matrix to store block means
  BOOTSTRAP <- matrix(nrow = 1, ncol = length(lb))
  colnames(BOOTSTRAP) <-lb
  #BOOTSTRAP <- list(length(lb))
  ########################################################
  ## This section use foreach function to do detail in the brace
  BOOTSTRAP <- foreach(b = 1:length(lb), .combine = 'cbind') %dopar%{
    l <- lb[b]# block size at each instance 
    m <- ceiling(t / l)                                 # number of blocks
    blk <- split(ts, rep(1:m, each=l, length.out = t))  # divides the series into blocks
    ######################################################
    res<-sample(blk, replace=T, 1000)        # resamples the blocks
    res.unlist <- unlist(res, use.names = FALSE)   # unlist the bootstrap series
    train <- head(res.unlist, round(length(res.unlist) - 10)) # Train set
    test <- tail(res.unlist, length(res.unlist) - length(train)) # Test set
    nfuture <- forecast::forecast(train, model = forecast::auto.arima(train), lambda=0, biasadj=TRUE, h = length(test))$mean        # makes the `forecast of test set
    RMSE <- Metrics::rmse(test, nfuture)      # RETURN RMSE
    BOOTSTRAP[b] <- RMSE
  }
  BOOTSTRAPS <- matrix(BOOTSTRAP, nrow = 1, ncol = length(lb))
  colnames(BOOTSTRAPS) <- lb
  BOOTSTRAPS
  return(list("BOOTSTRAPS" = BOOTSTRAPS))
}

初审

set.seed(123, kind = "L'Ecuyer-CMRG")
t(replicate(3, bootstrap1(10, 0.5)$BOOTSTRAPS[1,]))
#            2        3        4        5        6        7        8        9
#[1,] 3.353364 4.097191 3.759332 3.713234 4.541143 4.151920 4.603380 5.237056
#[2,] 4.490765 5.037171 4.289265 3.964172 3.225878 5.345506 4.646740 2.593153
#[3,] 4.514881 4.838114 3.701961 5.069747 4.165742 4.130256 3.951216 4.133241

二审

set.seed(123, kind = "L'Ecuyer-CMRG")
t(replicate(3, bootstrap1(10, 0.5)$BOOTSTRAPS[1,]))
#            2        3        4        5        6        7        8        9
#[1,] 3.271285 3.701031 2.725770 3.867532 3.283368 3.713057 3.274201 4.141896
#[2,] 3.987040 3.767720 5.440987 3.850190 3.306520 5.399880 5.337676 3.288834
#[3,] 5.157924 3.895024 3.996077 4.855608 4.443317 5.224098 5.335144 2.918870

我如何设置种子或如何获得相同的结果?

编辑

R在 Windows 上运行。

标签: rwindows

解决方案


你应该只设置一次种子。似乎您设置了两次种子(一次set.seed(1)在您的引导函数内部, 另一次在您的引导函数set.seed(123, kind = "L'Ecuyer-CMRG")外部。

尝试只使用一个set.seed()函数(并且两次都使用相同的值),看看是否可以解决它。


推荐阅读