r - How do I run lmer on a large data.frame?

Problem description

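I am working with a large data.frame of roughly 1.4 million observations. At first I ran my models on a subsample (10% of the full sample), because fitting a single model on the full data can take about two hours. Once I was sure that all variables were well harmonized and that every regression ran cleanly, I refit the models on the full sample. However, the regressions did not converge, and two different models returned the following two errors: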

Error in fun(xaa, ...) : Downdated VtV is not positive definite

Error in fun(xss, ...) : Downdated VtV is not positive definite
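For context, the subsample-first workflow described above can be sketched roughly as follows. This is a minimal illustration only: `full`, `sub`, the placeholder size, and the `lm()` stand-in model are assumptions made for the example, not the actual data or model.

```r
# Illustrative only: `full` stands in for the real data set.
set.seed(1)
n_full <- 10000                       # small placeholder size, not the real 1.4 million
full <- data.frame(y = rnorm(n_full), x1 = rnorm(n_full))

# Draw a 10% simple random subsample of rows, without replacement.
idx <- sample.int(nrow(full), size = nrow(full) %/% 10)
sub <- full[idx, ]

# Time a cheap stand-in model on the subsample before committing to a full run.
timing <- system.time(fit_sub <- lm(y ~ x1, data = sub))
print(timing[["elapsed"]])
```

A simple random subsample like this does not guarantee that every group is represented in the same proportion as in the full data, so the two fits can behave differently.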

I am not sure whether this is relevant, but my laptop is a MacBook Pro (Retina, 15-inch, Mid 2015):

Processor: 2.5 GHz quad-core Intel Core i7

Memory: 16 GB 1600 MHz DDR3

Graphics: Intel Iris Pro 1536 MB

The data are hierarchical, with the following structure: individuals (level 1) -> country_year (level 2) -> country (level 3).
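One way to check that a level-2 identifier really is nested within level 3 is sketched below. The toy data, the separate `year` column, and the use of `interaction()` to build `country_year` are assumptions made for illustration; the real data may already carry a properly nested identifier.

```r
# Toy data with hypothetical `country` and `year` columns.
set.seed(1)
toy <- data.frame(
  country = factor(sample(c("A", "B", "C"), 300, replace = TRUE)),
  year    = factor(sample(2001:2004, 300, replace = TRUE))
)

# Build a level-2 identifier that is unique within each country.
toy$country_year <- interaction(toy$country, toy$year, drop = TRUE)

# The factor is nested if every country_year level maps to exactly one country.
countries_per_cy <- tapply(toy$country, toy$country_year,
                           function(z) length(unique(z)))
is_nested <- all(countries_per_cy == 1)
print(is_nested)
```

If the check returns FALSE, the same country_year label occurs in more than one country, and lmer will treat the two grouping factors as crossed rather than nested.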

I fit the model with the lmer function and include random slopes at levels 2 and 3. Reproducible code is below. Can anyone advise me on how to solve this problem?

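library(tibble)  # provides tibble()
library(lme4)    # provides lmer()

n <- 1400000  # number of simulated observations

df <- tibble(
  y = rnorm(n),
  x1 = rnorm(n),
  x2 = rnorm(n),
  country = sample.int(30, size = n, replace = TRUE) - 1,
  country_year = sample.int(10, size = n, replace = TRUE) - 1
)

# Treat the grouping variables as factors
df$country <- as.factor(df$country)
df$country_year <- as.factor(df$country_year)

# Random intercepts and random slopes for x1 and x2 at levels 2 and 3
model1 <- lmer(y ~ x1 + x2 +
                 (x1 + x2 | country_year) +
                 (x1 + x2 | country), data = df)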
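As a possible diagnostic step, the same model structure can be fit on a much smaller simulated sample and inspected before scaling up. The sketch below is an assumption-laden illustration: the reduced size `n`, the object names `small` and `fit_small`, and the choice of `calc.derivs = FALSE` are not part of the original code, and the sketch does not establish whether any of this avoids the error on the full data.

```r
library(lme4)

set.seed(1)
n <- 5000                                     # reduced size for a quick check
small <- data.frame(
  y  = rnorm(n),
  x1 = rnorm(n),
  x2 = rnorm(n),
  country      = factor(sample.int(30, n, replace = TRUE)),
  country_year = factor(sample.int(10, n, replace = TRUE))
)

# calc.derivs = FALSE skips the derivative-based convergence checks that
# lmer otherwise runs after optimisation, which can be slow on large data.
fit_small <- lmer(
  y ~ x1 + x2 + (x1 + x2 | country_year) + (x1 + x2 | country),
  data = small,
  control = lmerControl(calc.derivs = FALSE)
)

# Inspect whether the random-effects structure is estimated on the boundary.
print(isSingular(fit_small))
print(VarCorr(fit_small))
```

Because the simulated outcome is pure noise, the random-effect variances have nothing to pick up, so a singular fit here would say little about the real data.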

Tags: r, hierarchical-data, large-data, lme4
