r - How do I extract all genes in gene_symbol that share the same start and end into a new column in R
Problem description
I have a data.table like the one below. I want a new column called Genes that collects all the gene symbols belonging to the same fragment (that is, the same start and end), as shown in the second table.
chr start end Fragments CK BB FP i.start i.end gene_name gene_symbol
1: 1 710000 715000 143 0.2662 1 0.0138 91421 762886 ENSG00000225880 LINC00115
2: 1 710000 715000 143 0.2662 1 0.0138 91421 762886 ENSG00000240453 RP11-206L10.10
3: 1 710000 715000 143 0.2662 1 0.0138 676386 762886 ENSG00000228327 RP11-206L10.2
4: 1 710000 715000 143 0.2662 1 0.0138 714172 740255 ENSG00000237491 RP11-206L10.9
5: 1 720000 725000 145 0.0000 0 0.0000 91421 762886 ENSG00000225880 LINC00115
6: 1 720000 725000 145 0.0000 0 0.0000 91421 762886 ENSG00000240453 RP11-206L10.10
I want it to look like this:
chr start end Fragments CK BB FP i.start i.end Genes
1: 1 710000 715000 143 0.2662 1 0.0138 91421 762886 LINC00115,RP11-206L10.10,RP11-206L10.2,RP11-206L10.9
2: 1 720000 725000 145 0.0000 0 0.0000 91421 762886 LINC00115,RP11-206L10.10
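Solution
We can group by the fragment columns and paste the gene symbols together. Since i.start and i.end differ between rows of the same fragment in the sample data, they are taken from the first row of each group rather than used as grouping keys, and FP is kept as a grouping column so it appears in the result. paste(collapse = ",") gives the separator shown in the desired output (toString() would insert ", " instead).
library(data.table)
# One row per fragment; gene symbols collapsed into a single comma-separated string
dt[, .(i.start = i.start[1], i.end = i.end[1],
       Genes = paste(gene_symbol, collapse = ",")),
   by = .(chr, start, end, Fragments, CK, BB, FP)]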
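If the goal is to keep Genes as an extra column on the existing table, one variant is to assign by reference with `:=` and then keep one row per fragment. This is only a minimal sketch, assuming a fragment is identified by chr, start and end alone; the toy table is rebuilt by hand from the rows in the question, and `res` is just an illustrative name.

```r
library(data.table)

# Toy data rebuilt from the rows printed in the question
dt <- data.table(
  chr         = 1L,
  start       = rep(c(710000, 720000), times = c(4, 2)),
  end         = rep(c(715000, 725000), times = c(4, 2)),
  Fragments   = rep(c(143, 145), times = c(4, 2)),
  CK          = rep(c(0.2662, 0), times = c(4, 2)),
  BB          = rep(c(1, 0), times = c(4, 2)),
  FP          = rep(c(0.0138, 0), times = c(4, 2)),
  i.start     = c(91421, 91421, 676386, 714172, 91421, 91421),
  i.end       = c(762886, 762886, 762886, 740255, 762886, 762886),
  gene_name   = c("ENSG00000225880", "ENSG00000240453", "ENSG00000228327",
                  "ENSG00000237491", "ENSG00000225880", "ENSG00000240453"),
  gene_symbol = c("LINC00115", "RP11-206L10.10", "RP11-206L10.2",
                  "RP11-206L10.9", "LINC00115", "RP11-206L10.10")
)

# Add Genes by reference: every row of a fragment gets the same collapsed string
dt[, Genes := paste(unique(gene_symbol), collapse = ","), by = .(chr, start, end)]

# Keep the first row of each fragment, then drop the per-gene columns
res <- unique(dt, by = c("chr", "start", "end"))
res[, c("gene_name", "gene_symbol") := NULL]
print(res)
```

The `unique()` inside `paste()` only matters if a gene symbol can repeat within a fragment; with the sample rows it changes nothing.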