首页 > 解决方案 > 从后验预测分布抽样(stan vs inla)

问题描述

我正在尝试从bayesplot一个对象的包中实现函数,INLA并且有点不确定如何从后验预测分布中提取。我想我几乎拥有它,但rstan 平局比平局更具可INLA变性。

rstan中,使用bayesplot 小插图中的简化示例,我可以:

library(bayesplot)
library(ggplot2)
library(rstanarm)
library(ggpubr)
library(tidyverse)


#rstan model set up
roaches$roach100 <- roaches$roach1 / 100 # pre-treatment number of roaches (in 100s)
fit_poisson <- stan_glm(y ~ roach100 + treatment + senior, offset = log(exposure2), family = poisson(link = "log"), data = roaches,  seed = 1111,  refresh = 0) 


#In order to use the PPC functions from the bayesplot package we need a vector y of outcome values:
y <- roaches$y
#and a matrix yrep of draws from the posterior predictive distribution,
yrep_poisson <- posterior_predict(fit_poisson, draws = 500)
#then plot:
p1 <- bayesplot::ppc_dens_overlay(y, yrep_poisson[1:50, ])
p1

在此处输入图像描述

我想在一个INLA对象上复制该图。根据bayesplot小插图,您可以这样做,因为他们提供了代码来定义一个简单的pp_check方法来创建类的拟合模型对象,例如foo

pp_check.foo <- function(object, type = c("multiple", "overlaid"), ...) {
  type <- match.arg(type)
  y <- object[["y"]]
  yrep <- object[["yrep"]]
  stopifnot(nrow(yrep) >= 50)
  samp <- sample(nrow(yrep), size = ifelse(type == "overlaid", 50, 5))
  yrep <- yrep[samp, ]
  
  if (type == "overlaid") {
    ppc_dens_overlay(y, yrep, ...) 
  } else {
    ppc_hist(y, yrep, ...)
  }
}

要使用pp_check.foo,我们可以用yyrep组件创建一个列表,并给它类 foo:

x <- list(y = rnorm(200), yrep = matrix(rnorm(1e5), nrow = 500, ncol = 200))
class(x) <- "foo"
#create plot above:
pp_check(x, type = "overlaid")

英拉

#create same model but in inla:
library(INLA)
fit_poisson_inla <- inla(y ~ roach100 + treatment + senior, offset = log(exposure2), data = roaches,
           control.predictor = list(compute = T),
           family = "poisson")

inla_object_name$marginals.fitted.values返回每个 的后验预测分布y

fit_poisson_inla$marginals.fitted.values
#so to get distribution for first oberservation:
fitted.Predictor.1 <- fit_poisson_inla$marginals.fitted.values[[1]]

我认为从这个重复采样会给我我需要的东西,但只有 75 个值(dim(fitted.Predictor.1)每个观察值用于创建这个分布,而实际上我想从全范围的值中采样。我认为我们可以做到这一点(部分4.3这里)通过inla.tmarginal使用线性预测器:

fitted_dist <- fit_poisson_inla$marginals.linear.predictor
#should i have used "inla.rmarginal(n, marginal)"?
marginal_dist <- lapply(fitted_dist, function(y) inla.tmarginal(function(x) {exp(x)}, y)) %>% map(~ as.data.frame(.) %>%  rename(., xx = x))
#resample 500 times
yrep_poisson_inla <- as.matrix(bind_rows(rerun(500, lapply(marginal_dist, function(x) sample(x$xx, 1)) %>% as.data.frame())))

#convert to class foo for pp_check
x <- list(y = y, yrep = yrep_poisson_inla[1:50, ])
class(x) <- "foo"
p2 <- pp_check(x, type = "overlaid")
#plot
ggarrange(p1, p2, ncol = 1, nrow = 2, labels = c("rstan", "inla sample"))

在此处输入图像描述

我的问题是我如何正确地从这个inla( fit_poisson_inla) 对象的后验预测分布中获取一个绘制矩阵以传递到pp_check yrep_poisson产生离散值,同时yrep_poisson_inla产生连续值。rstan平局的变化比INLA(第二个情节)要多得多。我所做的是否正确,这只是一些抽样问题,还是不同方法的产物?在更复杂的示例中,差异可能很大。

谢谢

标签: rmixed-modelsrstanrstanarmr-inla

解决方案


推荐阅读