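r - Aggregate data across multiple columns instead of a single column
Problem description
I have a huge gene expression dataset with 200k variables (rows) and 170 observations (columns). Here are the first few rows and columns: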
Gene Transcript_ID V1 V2 V3 V4 V5
1 ENSG00000000003.14 ENST00000612152.4 0 6 0 3 15
2 ENSG00000000003.14 ENST00000373020.8 4 0 5 0 0
3 ENSG00000000003.14 ENST00000614008.4 0 0 0 0 0
4 ENSG00000000003.14 ENST00000496771.5 0 3 0 0 7
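I am trying to group all of the data so that expression is summed per gene. I have existing syntax that groups a single data column by some metadata (the gene ID), and I am trying to get it to run for all 170 observations. The syntax is below; this should be a very simple fix.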
transcript_grouped <-aggregate(res$V1, by=list(Category=res$Gene), FUN=sum)
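V1 is the name of one observation/data column, res is the whole dataset, and Gene is the category I want to group by. This syntax works for V1, but I need to run it for every column.
I tried creating a variable holding all of the column names, even pasting them in by hand.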
dataColumns<- dataColumns = c("V1","V2","V3","V4","V5","V6","V7","V8","V9","V10","V11","V12","V13","V14","V15","V16","V17","V18","V19","V20","V21","V22","V23","V24","V25","V26","V27","V28","V29","V30","V31","V32","V33","V34","V35","V36","V37","V38","V39","V40","V41","V42","V43","V44","V45","V46","V47","V48","V49","V50","V51","V52","V53","V54","V55","V56","V57","V58","V59","V60","V61","V62","V63","V64","V65","V66","V67","V68","V69","V70","V71","V72","V73","V74","V75","V76","V77","V78","V79","V80","V81","V82","V83","V84","V85","V86","V87","V88","V89","V90","V91","V92","V93","V94","V95","V96","V97","V98","V99","V100","V101","V102","V103","V104","V105","V106","V107","V108","V109","V110","V111","V112","V113","V114","V115","V116","V117","V118","V119","V120","V121","V122","V123","V124","V125","V126","V127","V128","V129","V130","V131","V132","V133","V134","V135","V136","V137","V138","V139","V140","V141","V142","V143","V144","V145","V146","V147","V148","V149","V150","V151","V152","V153","V154","V155","V156","V157","V158","V159","V160","V161","V162","V163","V164","V165","V166")
trans_grouped <-aggregate(res$dataColumns, by=list(Category=res$Gene), FUN=sum)
Error in aggregate.data.frame(as.data.frame(x), ...) : no rows to aggregate
How can I loop this so that it includes all columns?
Solution
How about this dplyr solution:
library(dplyr)
df %>%
group_by(Gene) %>%
summarise(across(starts_with("V"), ~sum(.)))
# A tibble: 2 x 4
Gene V1 V2 V3
* <chr> <dbl> <dbl> <dbl>
1 A 4 4 7
2 B 6 4 3
Test data:
df <- data.frame(
Gene = c("A", "B", "A", "B"),
V1 = c(1,2,3,4),
V2 = c(2,2,2,2),
V3 = c(4,2,3,1)
)
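For readers who would rather stay in base R, the original aggregate() call can be adapted as well. The sketch below reuses the toy test data above rather than the asker's real dataset, and the grep() pattern for picking the data columns is an assumption about how they are named. It also shows the likely cause of the error: the `$` operator does not evaluate its argument, so res$dataColumns looks for a column literally named "dataColumns" and returns NULL, which leaves aggregate() with no rows.

```r
# Toy data mirroring the test data above (not the asker's real dataset)
df <- data.frame(
  Gene = c("A", "B", "A", "B"),
  V1 = c(1, 2, 3, 4),
  V2 = c(2, 2, 2, 2),
  V3 = c(4, 2, 3, 1)
)

# Collect the data column names programmatically instead of typing them;
# the pattern "V followed by digits" is an assumption about the naming
dataColumns <- grep("^V[0-9]+$", names(df), value = TRUE)

# `$` takes its argument literally: there is no column called
# "dataColumns", so this is NULL (hence "no rows to aggregate")
is.null(df$dataColumns)

# `[` accepts a character vector, so every data column is summed per gene
grouped <- aggregate(df[dataColumns], by = list(Gene = df$Gene), FUN = sum)
grouped
```

On the real data the same call would use res in place of df. With 200k rows, base R's rowsum(res[dataColumns], group = res$Gene) may be worth trying as a faster alternative, though that has not been measured here.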