R str_replace_all: adding parentheses or brackets in the replacement string

Problem description

I have a data.frame like this:

                         Family                   Genus               Specie
JN692281.1.1537 Pseudomonadaceae             Azotobacter       Ambiguous_taxa
HM128723.1.1454 Pseudomonadaceae             Pseudomonas uncultured bacterium
KX177686.1.1460  Sneathiellaceae                AT-s3-44 uncultured bacterium
KR912339.1.1546 Desulfobulbaceae Candidatus Electrothrix       Ambiguous_taxa
GU179625.1.1501 Pseudomonadaceae             Pseudomonas       Ambiguous_taxa

In the Specie column, I want to replace every taxon that is Ambiguous_taxa, metagenome, or uncultured bacterium with a label of the form (Genus:data)-Unknown, where data is the value from the Genus column:

                         Family                   Genus               Specie
JN692281.1.1537 Pseudomonadaceae             Azotobacter       (Genus:Azotobacter)-Unknown

So I used this code:

#extract the tax_table with Phyloseq 
DF <- as.data.frame(tax_table(Phyloseq_obj))

#Replace with 
Unwanted <- c("Ambiguous_taxa|metagenome|uncultured bacterium")

DF$Specie <- str_replace_all(string = DF$Specie, pattern = Unwanted, (paste("(Genus:", DF$Genus, sep = "")))

I get:

                          Family                   Genus                         Specie
JN692281.1.1537 Pseudomonadaceae             Azotobacter             (Genus:Azotobacter
HM128723.1.1454 Pseudomonadaceae             Pseudomonas             (Genus:Pseudomonas
KX177686.1.1460  Sneathiellaceae                AT-s3-44                (Genus:AT-s3-44
KR912339.1.1546 Desulfobulbaceae Candidatus Electrothrix (Genus:Candidatus Electrothrix
GU179625.1.1501 Pseudomonadaceae             Pseudomonas             (Genus:Pseudomonas

I want to close the parenthesis and append -Unknown, e.g. (Genus:Azotobacter)-Unknown:

                          Family                   Genus                         Specie
JN692281.1.1537 Pseudomonadaceae             Azotobacter             (Genus:Azotobacter)-Unknown 

Thanks!!!

Tags: r

Solution


Here is one way to do it with the base R sprintf function:

DF <- structure(list(
  code = c("JN692281.1.1537", "HM128723.1.1454", "KX177686.1.1460",
           "KR912339.1.1546", "GU179625.1.1501"),
  Family = c("Pseudomonadaceae", "Pseudomonadaceae", "Sneathiellaceae",
             "Desulfobulbaceae", "Pseudomonadaceae"),
  Genus = c("Azotobacter", "Pseudomonas", "AT-s3-44",
            "Candidatus Electrothrix", "Pseudomonas"),
  Specie = c("Ambiguous_taxa", "uncultured bacterium", "uncultured bacterium",
             "Ambiguous_taxa", "Ambiguous_taxa")
), class = "data.frame", row.names = c(NA, -5L))
Unwanted <- c('Ambiguous_taxa', 'metagenome', 'uncultured bacterium')

DF$Specie[DF$Specie %in% Unwanted] <- sprintf('(Genus:%s)-Unknown', DF$Genus[DF$Specie %in% Unwanted]) 

             code           Family                   Genus                                  Specie
1 JN692281.1.1537 Pseudomonadaceae             Azotobacter             (Genus:Azotobacter)-Unknown
2 HM128723.1.1454 Pseudomonadaceae             Pseudomonas             (Genus:Pseudomonas)-Unknown
3 KX177686.1.1460  Sneathiellaceae                AT-s3-44                (Genus:AT-s3-44)-Unknown
4 KR912339.1.1546 Desulfobulbaceae Candidatus Electrothrix (Genus:Candidatus Electrothrix)-Unknown
5 GU179625.1.1501 Pseudomonadaceae             Pseudomonas             (Genus:Pseudomonas)-Unknown
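
Note that %in% only matches cells that are exactly equal to one of the unwanted strings, whereas the regex in the question also matches them as substrings. If substring matching is what you need, a minimal base R sketch could look like the one below. The toy data frame and the names unwanted and label are made up for illustration, and the third row is an invented species name included only to show a cell that stays untouched. The key difference from the code in the question is that the closing ")-Unknown" is part of the pasted string.

```r
# Toy data mirroring the question's table (only the columns needed here).
# The third Specie value is a made-up example of a cell that should not change.
DF <- data.frame(
  Genus  = c("Azotobacter", "Pseudomonas", "AT-s3-44"),
  Specie = c("Ambiguous_taxa", "uncultured bacterium", "Pseudomonas putida"),
  stringsAsFactors = FALSE
)

# Regex alternation, as in the question.
unwanted <- "Ambiguous_taxa|metagenome|uncultured bacterium"

# paste0() builds the full label, including the closing parenthesis and suffix.
label <- paste0("(Genus:", DF$Genus, ")-Unknown")

# Replace the whole cell where the pattern matches; keep other cells as they are.
DF$Specie <- ifelse(grepl(unwanted, DF$Specie), label, DF$Specie)

print(DF)
```

The same paste0() expression should also work as the replacement argument in the original str_replace_all call, but keep in mind that str_replace_all substitutes only the matched part of each string, so any text around the match would be kept.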
