首页 > 解决方案 > MCMCglmm with genomic relatedness matrix: "levels do not have a row entry in ginverse"

问题描述

I am trying to estimate heritability of a quantitative trait using the animal model. Because I am studying a wild species of rodent, I lack pedigree information; instead, I am using a genomic relatedness matrix (GRM) using ~23,000 SNPs derived from ddRAD and estimated with PLINK/GCTA.

Although I have cross-checked the individuals in my GRM and my data file, I receive this error:

Error in MCMCglmm(trait ~ 1, random = ~ID, data = phens, ginverse = list(ID = gctaGRM),  : 
  some levels of ID do not have a row entry in ginverse

This problem is similar to one described previously: MCMCglmm data format difficulties. In this case, the poster used a pedigree.

My data are as follows:

  1. Data file with individual id, site, and phenotypic data, "phens"
> class(phens)
[1] "data.frame"

> head(phens)
    Site      ID       trait
1   QERC SGTm037  0.94770905
2 Robles SGTm038 -0.53410457
3   QERC SGTm039  0.35680224
4   QERC SGTm040 -0.24319776
5     MH SGTm041  0.02952266
6     MH SGTm043  2.25680224
> 
  1. Trait as a variable
> trait <- phens$trait # quantitative trait

> head(trait)
[1]  0.94770905 -0.53410457  0.35680224 -0.24319776  0.02952266  2.25680224
  1. GRM from either PLINK or GCTA, read in using gap package: https://github.com/cran/gap/blob/master/R/MCMCgrm.R
gctaGRM <- ReadGRM('gcta-GRM') # reading in GCTA grm
plinkGRM <- ReadGRMPLINK('plinkgenome') # reading in PLINK PIHAT grm

> head(gctaGRM)

          SGTm037     SGTm038     SGTm039
SGTm037 1.0487960 0.000000000 0.000000000
SGTm038 0.0000000 1.065967000 0.009045295
SGTm039 0.0000000 0.009045295 1.032933000
SGTm040 0.1843918 0.000000000 0.019915070
SGTm041 0.0000000 0.031226350 0.000000000
SGTm043 0.0000000 0.038495810 0.000000000

> head(plinkGRM)

        SGTm037 SGTm038 SGTm039 SGTm040 SGTm041
SGTm037  1.0000       0       0  0.1345       0
SGTm038  0.0000       1       0  0.0000       0
SGTm039  0.0000       0       1  0.0000       0
SGTm040  0.1345       0       0  1.0000       0
SGTm041  0.0000       0       0  0.0000       1
SGTm043  0.0000       0       0  0.0000       0

As far as I can tell, the list of IDs in phens agrees with the columns and rows of IDs in either GRM. I can provide the full data files for those who are curious.

> head(phens$ID)
[1] SGTm037 SGTm038 SGTm039 SGTm040 SGTm041 SGTm043

# also played with changing phens$ID from factor to character; made no difference

colnames(gctaGRM)
[1] "SGTm037" "SGTm038" "SGTm039" "SGTm040" "SGTm041" "SGTm043"

This must be a formatting issue, and I spent a lot of time researching this on stack, google groups, and r documentation to no avail. Unfortunately, few resources cover how to implement GRMs in these models (though see https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0197720). I tried changing classes of my data file and trait, but this did not work.

Any suggestions would be greatly appreciated!

标签: rmodelingglmmcmc

解决方案


我收到了同样的错误,和你一样,我很困惑,因为 GRM 的行名确实与我的数据中的内容匹配。

就我而言,我通过使用标准数据框而不是 tidyverse tibble 来解决这个data问题。当 tibble 用于 时,MCMCglmm 不会抛出错误data,但不幸的是,其内部数据检查的编码方式不适用于 tibble。


推荐阅读