R regex: remove all characters before and after the parentheses

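Problem description

Is it possible to remove all characters before and after the parentheses from the column names, for every column except the first, so that only the number inside the parentheses remains?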

Input df

df <- data.frame(CCLE_ID = c("AUTONOMIC_GANGLIA", "AUTONOMIC_GANGLIA", "AUTONOMIC_GANGLIA", "AUTONOMIC_GANGLIA", "AUTONOMIC_GANGLIA" ), `A1BG (1)` = c(NA_real_, NA_real_, NA_real_, NA_real_, NA_real_ ), `A1CF (29974)` = c(0.0100738474498, 0.00419071223405, 0.161435671978, 0.00437517766114, 0.00494118028018), `A2M (2)` = c(NA_real_, NA_real_, NA_real_, NA_real_, NA_real_), `A2ML1 (144568)` = c(NA_real_, NA_real_, NA_real_, NA_real_, NA_real_), `A4GALT (53947)` = c(NA_real_, NA_real_, NA_real_, NA_real_, NA_real_), `A4GNT (51146)` = c(NA_real_, NA_real_, NA_real_, NA_real_, NA_real_), `AAAS (8086)` = c(0.0261000247231, 0.00339180018571, 0.0124666557843, 0.00222981468535, 0.00236993307389 ), `AACS (65985)` = c(NA_real_, NA_real_, NA_real_, NA_real_, NA_real_), `AADAC (13)` = c(NA_real_, NA_real_, NA_real_, NA_real_, NA_real_), check.names = FALSE) 



Output df

df1 <- data.frame(CCLE_ID = c("AUTONOMIC_GANGLIA", "AUTONOMIC_GANGLIA", "AUTONOMIC_GANGLIA", "AUTONOMIC_GANGLIA", "AUTONOMIC_GANGLIA" ), `1` = c(NA_real_, NA_real_, NA_real_, NA_real_, NA_real_ ), `29974` = c(0.0100738474498, 0.00419071223405, 0.161435671978, 0.00437517766114, 0.00494118028018), `2` = c(NA_real_, NA_real_, NA_real_, NA_real_, NA_real_), `144568` = c(NA_real_, NA_real_, NA_real_, NA_real_, NA_real_), `53947` = c(NA_real_, NA_real_, NA_real_, NA_real_, NA_real_), `51146` = c(NA_real_, NA_real_, NA_real_, NA_real_, NA_real_), `8086` = c(0.0261000247231, 0.00339180018571, 0.0124666557843, 0.00222981468535, 0.00236993307389 ), `65985` = c(NA_real_, NA_real_, NA_real_, NA_real_, NA_real_), `13` = c(NA_real_, NA_real_, NA_real_, NA_real_, NA_real_), check.names = FALSE) 

Then use another data frame, df2, to rename the columns of df1 (old name to new name):

df2 <- data.frame(oldname = c("CCLE_ID", "1", "29974", "2", "144568", "53947", "51146", "8086", "65985", "13"), newname = c("CCLE_ID", "ESN", "PSA", "TGI", "PICJ", "TMNS", "IUJE", "UED", "PUQD", "STGW" ), check.names = FALSE)

Thank you in advance.

Tags: r, dataframe, tidyverse

Solution


Try this (rename_with() comes from dplyr, str_match() from stringr):

library(dplyr)
library(stringr)

df %>%
  rename_with(
    ~ str_match(.x, "\\((\\d+)\\)$")[,2],
    .cols = -CCLE_ID)
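
To see what the pattern does on its own, here is a minimal sketch applied to a plain character vector. The three names are copied from the question; `old_names`, `m` and `ids` are names introduced here for illustration only.

```r
library(stringr)

# Three of the column names from the question
old_names <- c("A1BG (1)", "A1CF (29974)", "AADAC (13)")

# Pattern, piece by piece:
#   \\(      a literal opening parenthesis
#   (\\d+)   capture group 1: one or more digits
#   \\)$     a literal closing parenthesis at the end of the string
# str_match() returns a character matrix: column 1 holds the full match,
# column 2 holds capture group 1.
m <- str_match(old_names, "\\((\\d+)\\)$")
ids <- m[, 2]
print(ids)
```

A name that does not end in "(digits)" gives NA in that column, which is presumably why the first column is excluded with `.cols = -CCLE_ID`.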

The result is:

> df %>%
+   rename_with(
+     ~ str_match(.x, "\\((\\d+)\\)$")[,2],
+     .cols = -CCLE_ID)
            CCLE_ID  1       29974  2 144568 53947 51146        8086 65985 13
1 AUTONOMIC_GANGLIA NA 0.010073847 NA     NA    NA    NA 0.026100025    NA NA
2 AUTONOMIC_GANGLIA NA 0.004190712 NA     NA    NA    NA 0.003391800    NA NA
3 AUTONOMIC_GANGLIA NA 0.161435672 NA     NA    NA    NA 0.012466656    NA NA
4 AUTONOMIC_GANGLIA NA 0.004375178 NA     NA    NA    NA 0.002229815    NA NA
5 AUTONOMIC_GANGLIA NA 0.004941180 NA     NA    NA    NA 0.002369933    NA NA
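
The answer above covers only the first step. The second step in the question, renaming the columns through df2, could be sketched roughly as follows. This is one possible approach rather than part of the original answer: it uses a cut-down copy of the question's data (two value columns, two rows), and `lookup` and `df_renamed` are names introduced here for illustration.

```r
library(dplyr)
library(stringr)

# Cut-down stand-in for the question's df (assumption: same layout, fewer columns)
df <- data.frame(CCLE_ID = c("AUTONOMIC_GANGLIA", "AUTONOMIC_GANGLIA"),
                 `A1BG (1)` = c(NA_real_, NA_real_),
                 `A1CF (29974)` = c(0.0100738474498, 0.00419071223405),
                 check.names = FALSE)

# Matching subset of the question's df2 (old name -> new name)
df2 <- data.frame(oldname = c("CCLE_ID", "1", "29974"),
                  newname = c("CCLE_ID", "ESN", "PSA"),
                  stringsAsFactors = FALSE)

# Step 1: keep only the number inside the parentheses
df1 <- df %>%
  rename_with(~ str_match(.x, "\\((\\d+)\\)$")[, 2], .cols = -CCLE_ID)

# Step 2: build a named lookup vector, indexed by old name
lookup <- setNames(df2$newname, df2$oldname)

# Rename only the columns listed in df2$oldname; any other column keeps its name
df_renamed <- df1 %>%
  rename_with(~ unname(lookup[.x]), .cols = any_of(df2$oldname))

print(names(df_renamed))
```

Using `any_of()` means old names in df2 that are missing from df1 are skipped silently instead of raising an error, which may or may not be what you want.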
