Running cv.glmnet() on a large matrix for multinomial classification

Problem description

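I am working with a large matrix with N = 40 samples and P = 7130 features. I am trying to fit a ridge model with cv.glmnet(), but I get an error when I do so.
The dimensions of the dataset are (40, 7130).
The cv.glmnet() call is as follows: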

ridge2_cv <- cv.glmnet(x, y,
                       ## type.measure: loss to use for cross-validation.
                       type.measure = "deviance",
                       ## Number of folds; 10 is the default.
                       nfolds = 10,
                       ## Multinomial regression
                       family = "multinomial",
                       ## 'alpha = 1' is the lasso penalty, and 'alpha = 0' the ridge penalty.
                       alpha = 0)

x is a large matrix with 285160 elements, and y is a multiclass response variable of length 40.
When I run the function above, I keep getting this error:

Error in cbind2(1, newx) %*% (nbeta[[i]]) :
  invalid class 'NA' to dup_mMatrix_as_dgeMatrix
In addition: Warning messages:
1: In lognet(x, is.sparse, ix, jx, y, weights, offset, alpha, nobs, :
  one multinomial or binomial class has fewer than 8 observations; dangerous ground
2: In lognet(x, is.sparse, ix, jx, y, weights, offset, alpha, nobs, :
  one multinomial or binomial class has fewer than 8 observations; dangerous ground
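
The warnings in this output say that at least one class has fewer than 8 observations. A minimal sketch of how one might inspect the class counts and build stratified fold assignments is shown here. It uses base R only and made-up labels (`y_demo` is hypothetical, not the asker's data). The resulting vector could be passed to cv.glmnet() through its `foldid` argument, though nothing in the thread confirms that this removes the error.

```r
# Hypothetical labels standing in for y: 40 samples, one rare class.
y_demo <- factor(rep(c("A", "B", "C"), times = c(20, 15, 5)))

# Count observations per class.
class_counts <- table(y_demo)
print(class_counts)

# Classes below the threshold named in the glmnet warning (8 observations).
small_classes <- names(class_counts)[class_counts < 8]
print(small_classes)

# Stratified fold ids: spread each class as evenly as possible over k folds,
# so that no fold holds every member of a small class.
set.seed(1)
k <- 5  # assumed value, chosen so each fold gets one member of the rare class
foldid <- integer(length(y_demo))
for (cls in levels(y_demo)) {
  idx <- which(y_demo == cls)
  foldid[idx] <- sample(rep(seq_len(k), length.out = length(idx)))
}

# Cross-tabulate class membership against fold assignment.
fold_table <- table(y_demo, foldid)
print(fold_table)
```

With real data, `y_demo` would be replaced by the actual response, and the call would become something like `cv.glmnet(x, y, foldid = foldid, family = "multinomial", alpha = 0)`.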

Tags: r, sparse-matrix, logistic-regression, glmnet, multinomial

Solution


Could you try building the matrix with data.matrix() instead of as.matrix()? I remember trying something similar.

ridge2_cv <- cv.glmnet(data.matrix(x), y,
                       ## type.measure: loss to use for cross-validation.
                       type.measure = "deviance",
                       ## Number of folds; 10 is the default.
                       nfolds = 10,
                       ## Multinomial regression
                       family = "multinomial",
                       ## 'alpha = 1' is the lasso penalty, and 'alpha = 0' the ridge penalty.
                       alpha = 0)
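
To illustrate why the choice of conversion function can matter, here is a small sketch on a made-up data frame (`df` and its columns are hypothetical). When a data frame contains any non-numeric column, as.matrix() returns a character matrix, while data.matrix() replaces factors with their integer codes and returns a numeric matrix. Whether this is what caused the error in the question is an assumption; the thread does not show how x was built.

```r
# Hypothetical toy data frame standing in for x (not the asker's data).
df <- data.frame(
  g1 = c(1.5, 2.0, 3.25),
  g2 = c("a", "b", "a"),     # a non-numeric column
  stringsAsFactors = TRUE    # store g2 as a factor
)

m_as   <- as.matrix(df)      # mixed columns: every entry becomes a string
m_data <- data.matrix(df)    # factor levels become integer codes

print(typeof(m_as))          # "character"
print(typeof(m_data))        # "double"
print(m_data)
```

A quick `is.numeric(x)` and `anyNA(x)` on the matrix before calling cv.glmnet() would show whether the input is in the form the fitting routine expects.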
