How to convert a list of nested data frames into a count matrix based on the common values in the data frames

Problem description

I have a long list of gene sets. I have added a toy example below.

Output of dput(list1):

list(ENDOSS = structure(list(ENDOSS = c("CDKN1C", "SOX6", "TGFB2"
)), row.names = c(NA, -3L), class = "data.frame"), ENDOSSSD = structure(list(
    ENDOSSSD = c("CDKN1C", "SOX6", "TGFB2")), row.names = c(NA, 
-3L), class = "data.frame"), GASTRIN = structure(list(GASTRIN = c("IKBKB", 
"KIT", "SERPINE1")), row.names = c(NA, -3L), class = "data.frame"), 
    METCC = structure(list(METCC = character(0)), row.names = character(0), class = "data.frame"))

The toy list looks like this:

list1
    ENDOSS
         "CDKN1C", "SOX6", "TGFB2" 
    ENDOSSSD
         "CDKN1C", "SOX6", "TGFB2"
    GASTRIN
          "IKBKB", "KIT", "SERPINE1"
    METCC

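I want to convert this list into a count matrix, with one row per list element and one column per gene. For the example above, the output should look like this: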

             CDKN1C IKBKB KIT SERPINE1 SOX6 TGFB2
    ENDOSS        1     0   0        0    1     1
    ENDOSSSD      1     0   0        0    1     1
    GASTRIN       0     1   1        1    0     0
    METCC         0     0   0        0    0     0

Any help would be greatly appreciated. Thanks.

Tags: r, list, matrix

Solution


We can use mtabulate from the qdapTools package after converting the column in each list element to a vector:

library(qdapTools)
mtabulate(lapply(list1, unlist))
         CDKN1C IKBKB KIT SERPINE1 SOX6 TGFB2
ENDOSS        1     0   0        0    1     1
ENDOSSSD      1     0   0        0    1     1
GASTRIN       0     1   1        1    0     0
METCC         0     0   0        0    0     0
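
If installing qdapTools is not an option, the same matrix can probably be built with base R alone. The following is only a sketch, not part of the original answer: it rebuilds the toy list from the question, and the names genes, all_genes, and counts are illustrative choices. It assumes each list element is a one-column data frame of gene symbols.

```r
# Toy input from the question: a named list of one-column data frames.
list1 <- list(
  ENDOSS   = data.frame(ENDOSS   = c("CDKN1C", "SOX6", "TGFB2"), stringsAsFactors = FALSE),
  ENDOSSSD = data.frame(ENDOSSSD = c("CDKN1C", "SOX6", "TGFB2"), stringsAsFactors = FALSE),
  GASTRIN  = data.frame(GASTRIN  = c("IKBKB", "KIT", "SERPINE1"), stringsAsFactors = FALSE),
  METCC    = data.frame(METCC    = character(0), stringsAsFactors = FALSE)
)

# Flatten each data frame into a plain character vector of genes.
genes <- lapply(list1, function(df) as.character(unlist(df, use.names = FALSE)))

# The sorted set of all genes seen anywhere defines the matrix columns.
all_genes <- sort(unique(unlist(genes, use.names = FALSE)))

# For each list element, count how often each gene occurs.
# match() maps genes to column positions; tabulate() counts them,
# returning all zeros for an empty element such as METCC.
counts <- t(vapply(
  genes,
  function(g) tabulate(match(g, all_genes), nbins = length(all_genes)),
  integer(length(all_genes))
))
colnames(counts) <- all_genes

counts
```

For the toy list this should give the same 4 x 6 layout as the mtabulate output above, as an integer matrix rather than a data frame. Using vapply instead of sapply keeps the result shape fixed even when an element is empty.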
