How can I combine tibble rows that share the same row name but have unique values?

Question

I have the following tibble:

  Symbol `Annotated Term`                           
  <chr>  <chr>                                      
1 H2aj   chromatin silencing                        
2 Sfpq   histone H3 deacetylation                   
3 Ube2n  histone ubiquitination                     
4 Ube2n  positive regulation of histone modification

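How can I collapse the duplicated symbols into a single row and combine their annotated terms on that same row, so that the tibble above looks like this: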

  Symbol `Annotated Term`                           
  <chr>  <chr>                                      
1 H2aj   chromatin silencing                        
2 Sfpq   histone H3 deacetylation                   
3 Ube2n  histone ubiquitination, positive regulation of histone modification

Any help would be greatly appreciated.

Tags: r, dataframe, dplyr

Answer


That depends. If you want the strings for each Symbol concatenated into one single string, then:

opt1 <- aggregate(`Annotated Term` ~ Symbol, data = dat, FUN = toString)
opt1
#   Symbol                                                      Annotated Term
# 1   H2aj                                                 chromatin silencing
# 2   Sfpq                                            histone H3 deacetylation
# 3  Ube2n histone ubiquitination, positive regulation of histone modification
str(opt1)
# 'data.frame': 3 obs. of  2 variables:
#  $ Symbol        : chr  "H2aj" "Sfpq" "Ube2n"
#  $ Annotated Term: chr  "chromatin silencing" "histone H3 deacetylation" "histone ubiquitination, positive regulation of histone modification"
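
Since the question is tagged dplyr, here is a minimal sketch of the same concatenation written with dplyr. It assumes dplyr 1.0.0 or later is installed (for the `.groups` argument). The name `opt1_dplyr` is only illustrative, and `dat` is rebuilt inline so the snippet runs on its own.

```r
library(dplyr)

# Same example data as in the Data section
dat <- data.frame(
  Symbol = c("H2aj", "Sfpq", "Ube2n", "Ube2n"),
  `Annotated Term` = c("chromatin silencing",
                       "histone H3 deacetylation",
                       "histone ubiquitination",
                       "positive regulation of histone modification"),
  check.names = FALSE  # keep the space in the column name
)

# One row per Symbol; toString() joins that group's terms with ", "
opt1_dplyr <- dat %>%
  group_by(Symbol) %>%
  summarise(`Annotated Term` = toString(`Annotated Term`), .groups = "drop")
```

The result is a tibble with one row per Symbol. If a different separator is preferred, `toString()` can be swapped for `paste(`Annotated Term`, collapse = "; ")`.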

If you want to be able to split them apart easily later, then your `Annotated Term` column needs to stay a list column, in which case:

opt2 <- aggregate(`Annotated Term` ~ Symbol, data = dat, FUN = list)
opt2
#   Symbol                                                      Annotated Term
# 1   H2aj                                                 chromatin silencing
# 2   Sfpq                                            histone H3 deacetylation
# 3  Ube2n histone ubiquitination, positive regulation of histone modification
str(opt2)
# 'data.frame': 3 obs. of  2 variables:
#  $ Symbol        : chr  "H2aj" "Sfpq" "Ube2n"
#  $ Annotated Term:List of 3
#   ..$ : chr "chromatin silencing"
#   ..$ : chr "histone H3 deacetylation"
#   ..$ : chr  "histone ubiquitination" "positive regulation of histone modification"
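
To illustrate why the list column is easy to split later, here is a minimal base-R sketch that expands `opt2` back to one row per term. The names `terms` and `long` are only illustrative, and `dat` is rebuilt inline so the snippet runs on its own.

```r
# Same example data as in the Data section
dat <- data.frame(
  Symbol = c("H2aj", "Sfpq", "Ube2n", "Ube2n"),
  `Annotated Term` = c("chromatin silencing",
                       "histone H3 deacetylation",
                       "histone ubiquitination",
                       "positive regulation of histone modification"),
  check.names = FALSE  # keep the space in the column name
)

opt2 <- aggregate(`Annotated Term` ~ Symbol, data = dat, FUN = list)

# Repeat each Symbol once per term it holds, then flatten the list column
terms <- opt2$`Annotated Term`
long <- data.frame(
  Symbol = rep(opt2$Symbol, lengths(terms)),
  `Annotated Term` = unlist(terms),
  check.names = FALSE
)
```

No string parsing is involved, so terms that themselves contain a comma survive the round trip. If tidyr is available, `tidyr::unnest()` on the list column should give an equivalent result.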

(If you are not yet familiar with list columns and are confident you will not need to un-combine them later, then I recommend opt1.)


Data

dat <- structure(list(
  Symbol = c("H2aj", "Sfpq", "Ube2n", "Ube2n"),
  `Annotated Term` = c("chromatin silencing", "histone H3 deacetylation",
                       "histone ubiquitination",
                       "positive regulation of histone modification")),
  row.names = c(NA, -4L), class = "data.frame")
