Looping over data.table columns and applying glm

Question

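I am trying to loop over the columns of my data.table and apply glm to each column using a for loop. I then want to extract the regression coefficient from each model and add it to my output data.table.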

dt is a data.table and y is a vector:

output = data.table('reg_coef' = numeric())
for(n in 1:ncol(dt)){
  model = glm(y ~ dt[, n], family=binomial(link="logit"))
  reg_coef = summary(model)$coefficients[2]
  output = rbindlist(list(output, list(reg_coef)))
}

Why doesn't this work? I get this error:

Error in `[.data.table`(dt, , n) : 
  j (the 2nd argument inside [...]) is a single symbol but column name 'n' is not found. Perhaps you intended DT[, ..n]. This difference to data.frame is deliberate and explained in FAQ 1.1. 

Tags: r, for-loop, datatable

Solution


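The error occurs because data.table evaluates j within the scope of the table, so dt[, n] looks for a column named n instead of using the loop variable as a column index. You can fit the model and extract the coefficient in the same loop by iterating over the column names with lapply: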

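# Fit one logistic regression per column of dt and collect the slope estimates
output <- do.call(rbind, lapply(names(dt), function(x) {
  # reformulate(x, 'y') builds the formula y ~ <column name>
  model <- glm(reformulate(x, 'y'), dt, family=binomial(link="logit"))
  # second entry of the coefficient matrix: the estimate for x
  summary(model)$coefficients[2]
}))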
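
If you would rather keep the for loop from the question, a minimal sketch along the following lines should also work. It assumes the columns of dt are numeric and that y is a 0/1 vector of the same length. The toy data (x1, x2, the seed) and the variable column in the output are made up for illustration and are not part of the original question. It extracts each column as a plain vector with dt[[n]], and it fills a preallocated vector instead of calling rbindlist on every iteration, which is a substitution on my part rather than something the question requires.

```r
library(data.table)

# Hypothetical toy data: two numeric predictors and a binary response.
set.seed(1)
dt <- data.table(x1 = rnorm(100), x2 = rnorm(100))
y <- rbinom(100, 1, plogis(0.5 * dt$x1))

# Preallocate one slot per column instead of growing a table in the loop.
coefs <- numeric(ncol(dt))

for (n in seq_len(ncol(dt))) {
  # dt[[n]] returns column n as a vector, so glm sees an ordinary predictor.
  model <- glm(y ~ dt[[n]], family = binomial(link = "logit"))
  # The second coefficient is the slope for the current column.
  coefs[n] <- coef(model)[2]
}

# Collect the results, keeping the column names next to the estimates.
output <- data.table(variable = names(dt), reg_coef = coefs)
print(output)
```

Using seq_len(ncol(dt)) rather than 1:ncol(dt) also avoids iterating over c(1, 0) when dt has no columns.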
