gsub does not replace names in a column

Question

Using R, my data frame capstone3 has a column Certificate...HQA with the following levels:

levels(capstone3$Certificate...HQA)

 [1] "CUM LAUDE"                     "DIPLOM"                       
 [3] "DOCTORATE"                     "GRADUATE DIPLOMA"             
 [5] "HIGHEST HONS"                  "HONOURS (DISTINCTION)"        
 [7] "HONOURS (HIGHEST DISTINCTION)" "HONS"                         
 [9] "HONS I"                        "HONS II"                      
[11] "HONS II LOWER"                 "HONS II UPPER"                
[13] "HONS III"                      "HONS UNCLASSIFIED"            
[15] "HONS WITH MERIT"               "MAGNA CUM LAUDE"              
[17] "MASTER'S DEGREE"               "OTHER HONS"                   
[19] "PASS DEGREE"                   "PASS WITH CREDIT"             
[21] "PASS WITH DISTINCTION"         "PASS WITH HIGH MERIT"         
[23] "PASS WITH MERIT"               "SUMMA CUM LAUDE" 

I wrote some code to reduce the number of levels by replacing level [7] with level [9], level [6] with level [12], and so on:

capstone3$Certificate...HQA <- as.factor(capstone3$Certificate...HQA)

capstone3$Certificate...HQA <- gsub("HONOURS (HIGHEST DISTINCTION)","HONS I", capstone3$Certificate...HQA)

capstone3$Certificate...HQA <- gsub("HONOURS (DISTINCTION)","HONS II UPPER", capstone3$Certificate...HQA)

capstone3$Certificate...HQA <- gsub("HONS WITH MERIT","HONS II LOWER", capstone3$Certificate...HQA)

However, the gsub calls above do not replace the names in the column. Can anyone point out what is wrong with my code?

Tags: r, regex, gsub

Answer


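Parentheses ( ) are special characters in regular expressions: they create groups rather than matching themselves. To match literal parentheses, escape them with \\ (a double backslash inside an R string):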

gsub("HONOURS \\(HIGHEST DISTINCTION\\)","HONS I", capstone3$Certificate...HQA)
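
To see why the original call changes nothing, here is a minimal sketch on a toy vector. The vector x and the names unescaped and escaped are made up for illustration and are not part of the question's data.

```r
# Toy vector standing in for the column (hypothetical data)
x <- c("HONOURS (HIGHEST DISTINCTION)", "HONS I", "PASS DEGREE")

# Unescaped: ( ) form a group, so the regex looks for the text
# "HONOURS HIGHEST DISTINCTION" with no literal parentheses.
# Nothing in x contains that text, so x comes back unchanged.
unescaped <- gsub("HONOURS (HIGHEST DISTINCTION)", "HONS I", x)

# Escaped: \\( and \\) match the literal parentheses
escaped <- gsub("HONOURS \\(HIGHEST DISTINCTION\\)", "HONS I", x)

print(unescaped)
print(escaped)
```

Only the escaped call should rewrite the first element; the unescaped call silently finds no match, which is consistent with the behaviour described in the question.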

Or, as @ManuelBickel suggested, use fixed = TRUE so that the pattern is treated as a plain string and matched as is:

gsub("HONOURS (HIGHEST DISTINCTION)","HONS I", capstone3$Certificate...HQA, fixed = TRUE)
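
Putting the three replacements from the question together with fixed = TRUE might look like the sketch below. The vector hqa is a hypothetical stand-in for capstone3$Certificate...HQA. Note that gsub returns a character vector even when given a factor, so the sketch assumes you want to convert back with factor() to get the reduced set of levels.

```r
# Hypothetical stand-in for capstone3$Certificate...HQA
hqa <- factor(c("HONOURS (HIGHEST DISTINCTION)", "HONOURS (DISTINCTION)",
                "HONS WITH MERIT", "HONS I"))

# gsub() coerces the factor to character and returns a character vector;
# fixed = TRUE makes the parentheses ordinary characters
out <- gsub("HONOURS (HIGHEST DISTINCTION)", "HONS I", hqa, fixed = TRUE)
out <- gsub("HONOURS (DISTINCTION)", "HONS II UPPER", out, fixed = TRUE)
out <- gsub("HONS WITH MERIT", "HONS II LOWER", out, fixed = TRUE)

# Convert back to a factor so the merged levels are recomputed
hqa_merged <- factor(out)
print(levels(hqa_merged))
```

With this toy input, the four original values should collapse to three levels. On the real data frame the same calls would be assigned back to the column instead of to out.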
