How can I merge rows with the same value in one column to get unique values in another column in R?

Question

Hi,

I would like to merge rows conditionally: when several rows have the same value in the column NameSize (for example, Chaetoceros), I want to combine their values from the columns August 2018 and August 2019 into a single row, keeping only the unique values in each column.

The same situation occurs for many other species in the NameSize column (several more Chaetoceros entries, among others).

Example of part of my data with dput():

structure(list(class = c("Bacillariophyceae", "Bacillariophyceae", 
"Bacillariophyceae", "Bacillariophyceae", "Bacillariophyceae", 
"Bacillariophyceae"), NameSize = c("Attheya longicornis", "Bacterosira bathyomphala", 
"Chaetoceros", "Chaetoceros", "Chaetoceros cf. atlanticus", "Chaetoceros cf. borealis"
), `August 2018` = c("SICE3", "P1,SICE3", "P1,PICE1,SICE3", "PICE1", 
"SICE3", "P1,PICE1,SICE3"), `August 2019` = c("P6,P7,Sice4", 
"P6", "", "P2", "", "")), row.names = c(NA, -6L), class = c("tbl_df", 
"tbl", "data.frame"))

Desired result:

structure(list(class = c("Bacillariophyceae", "Bacillariophyceae", 
"Bacillariophyceae", "Bacillariophyceae", "Bacillariophyceae"
), NameSize = c("Attheya longicornis", "Bacterosira bathyomphala", 
"Chaetoceros", "Chaetoceros cf. atlanticus", "Chaetoceros cf. borealis"
), `Aug-18` = c("SICE3", "P1,SICE3", "P1,PICE1,SICE3", "SICE3", 
"P1,PICE1,SICE3"), `Aug-19` = c("P6,P7,Sice4", "P6", "P2", NA, 
NA)), class = c("spec_tbl_df", "tbl_df", "tbl", "data.frame"), row.names = c(NA, 
-5L), spec = structure(list(cols = list(class = structure(list(), class = c("collector_character", 
"collector")), NameSize = structure(list(), class = c("collector_character", 
"collector")), `Aug-18` = structure(list(), class = c("collector_character", 
"collector")), `Aug-19` = structure(list(), class = c("collector_character", 
"collector"))), default = structure(list(), class = c("collector_guess", 
"collector")), skip = 1L), class = "col_spec"))

Tags: r, merge, row, unique

Solution


library(dplyr)

# Sample data from the question (dput() output)
p <- structure(list(
  class = c("Bacillariophyceae", "Bacillariophyceae", "Bacillariophyceae",
            "Bacillariophyceae", "Bacillariophyceae", "Bacillariophyceae"),
  NameSize = c("Attheya longicornis", "Bacterosira bathyomphala",
               "Chaetoceros", "Chaetoceros",
               "Chaetoceros cf. atlanticus", "Chaetoceros cf. borealis"),
  `August 2018` = c("SICE3", "P1,SICE3", "P1,PICE1,SICE3", "PICE1",
                    "SICE3", "P1,PICE1,SICE3"),
  `August 2019` = c("P6,P7,Sice4", "P6", "", "P2", "", "")),
  row.names = c(NA, -6L), class = c("tbl_df", "tbl", "data.frame"))

I did this with the dplyr library:

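# Collapse each (class, NameSize) group to one row.
# Note: first()/last() pick a single existing value per group; they do not
# combine values. On this sample that happens to match the desired result.
p1 <- p %>%
  group_by(class, NameSize) %>%
  summarise(`August 2018` = first(`August 2018`),
            `August 2019` = last(`August 2019`)) %>%
  as.data.frame()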

Hope this works for you.

  class             NameSize                   `August 2018`  `August 2019`
  <chr>             <chr>                      <chr>          <chr>        
1 Bacillariophyceae Attheya longicornis        SICE3          "P6,P7,Sice4"
2 Bacillariophyceae Bacterosira bathyomphala   P1,SICE3       "P6"         
3 Bacillariophyceae Chaetoceros                P1,PICE1,SICE3 "P2"         
4 Bacillariophyceae Chaetoceros cf. atlanticus SICE3          ""           
5 Bacillariophyceae Chaetoceros cf. borealis   P1,PICE1,SICE3 ""   
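
The question asks for the combined unique values per group, which first() and last() only deliver when one row in the group already holds the full set. A more general variant might split each cell on commas, pool the pieces within a group, drop duplicates and empty strings, and paste the rest back together. The sketch below is not part of the original answer: the helper name combine_unique is hypothetical, the data are rebuilt from the question's dput() output, and it assumes dplyr 1.0.0 or later (for across() and the .groups argument).

```r
library(dplyr)

# Hypothetical helper (not from the original answer): merge several
# comma-separated strings into one string of unique, non-empty tokens.
combine_unique <- function(x) {
  tokens <- unlist(strsplit(x, ",", fixed = TRUE))
  tokens <- unique(tokens[!is.na(tokens) & tokens != ""])
  # Return NA when nothing is left, as in the desired result
  if (length(tokens) == 0) NA_character_ else paste(tokens, collapse = ",")
}

# Sample data rebuilt from the question's dput() output
p <- tibble(
  class = rep("Bacillariophyceae", 6),
  NameSize = c("Attheya longicornis", "Bacterosira bathyomphala",
               "Chaetoceros", "Chaetoceros",
               "Chaetoceros cf. atlanticus", "Chaetoceros cf. borealis"),
  `August 2018` = c("SICE3", "P1,SICE3", "P1,PICE1,SICE3", "PICE1",
                    "SICE3", "P1,PICE1,SICE3"),
  `August 2019` = c("P6,P7,Sice4", "P6", "", "P2", "", "")
)

# One row per (class, NameSize); each month column holds the pooled unique values
p2 <- p %>%
  group_by(class, NameSize) %>%
  summarise(across(c(`August 2018`, `August 2019`), combine_unique),
            .groups = "drop")
```

Tokens are kept in order of first appearance and compared case-sensitively, so "Sice4" and "SICE4" would count as different values. Groups with no values at all get NA instead of an empty string.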
