首页 > 解决方案 > R 中的蒙特卡洛自举功率分析仿真需要多长时间?可能是几个小时?(1000 次重复,1000 次引导)

问题描述

我正在使用蒙特卡罗模拟来对纵向中介模型进行功率分析。我正在使用 bmem 包 (lavaan) 中的 power.boot 函数。

我只用 5 reps/5 bootstrap 检查了代码,以确保它有效并且确实有效。

然后我按照包文档的建议以 1000 次重复和 1000 次引导运行代码。

现在已经一个多小时了,它还在运行——这正常吗?多长时间才算过长?

powermodel1 <-'

x2 ~ start(.6)*x1 + x*x1 
x3 ~ start(.6)*x2 + x*x2

m2 ~ start(.15)*x1 + a*x1 + start(.3)*m1 + m*m1 
m3 ~ start(.15)*x2 + a*x2 + start(.3)*m2 + m*m2

y2 ~ start(.5)*m1 + b*m2 + start(.3)*y1 + y*y1
y3 ~ start(.5)*m2 + b*m2 + start(.3)*y2 + y*y2 + start(0.05)*x1 + c*x1

x1 ~~ start(.15)*m1

x1 ~~ start(.15)*y1


y1 ~~ start(.5)*m1

'
indirect <- 'ab:=a*b'

N<-200

system.time(bootstrap<-power.boot(powermodel1, indirect, N, nrep=1000, nboot=1000, parallel = 'multicore'))

            summary(bootstrap)

标签: rsimulationmontecarlostatistics-bootstrappower-analysis

解决方案


不幸的是,它看起来需要一段时间。在我的系统上约 8 小时:

library(bmem)

powermodel1 <-'

x2 ~ start(.6)*x1 + x*x1 
x3 ~ start(.6)*x2 + x*x2

m2 ~ start(.15)*x1 + a*x1 + start(.3)*m1 + m*m1 
m3 ~ start(.15)*x2 + a*x2 + start(.3)*m2 + m*m2

y2 ~ start(.5)*m1 + b*m2 + start(.3)*y1 + y*y1
y3 ~ start(.5)*m2 + b*m2 + start(.3)*y2 + y*y2 + start(0.05)*x1 + c*x1

x1 ~~ start(.15)*m1

x1 ~~ start(.15)*y1


y1 ~~ start(.5)*m1

'
indirect <- 'ab:=a*b'

N<-200

system.time(bootstrap<-bmem::power.boot(powermodel1, indirect, N, nrep = 10, nboot = 10, parallel = 'multicore'))
system.time(bootstrap<-bmem::power.boot(powermodel1, indirect, N, nrep = 30, nboot = 30, parallel = 'multicore'))
system.time(bootstrap<-bmem::power.boot(powermodel1, indirect, N, nrep = 60, nboot = 60, parallel = 'multicore'))
system.time(bootstrap<-bmem::power.boot(powermodel1, indirect, N, nrep = 100, nboot = 100, parallel = 'multicore'))

library(tidyverse)
# Load the times from above into a dataframe
benchmark <- tibble(bootstraps = c(10, 30, 60, 100),
                    times = c(4.021, 30.122, 121.103, 311.236)) 

# Plot the points and fit a curve
ggplot(benchmark, aes(x = bootstraps, y = times)) +
  geom_point() +
  geom_smooth(se = FALSE, span = 5)

例子.png

# Fit a model
fit <- lm(data = benchmark, times~poly(bootstraps,
                                       2, raw=TRUE))
newtimes <- data.frame(bootstraps = seq(100, 1000, length = 4))

# Predict the time it will take for larger bootstrap/rep values
predict(fit, newdata = newtimes)
>        1          2          3          4 
>  311.6829  4568.3812 13789.6754 27975.5655 

# Convert from seconds to hours
print(27975.5655/60/60)
>[1] 7.77099

推荐阅读