首页 > 解决方案 > r中如何进行模糊匹配?

问题描述

我有一些如下数据,

plot    Tsoil_11 Tsoil_12 Tsoil_13 Tsoil_21 Tsoil_22  SWC_11   SWC_12  SWC_13  SWC_21  SWC_22 
11        1        2         3       4         5       6       7       8       9       10
11b      11       12        13      14       15       16       17      18     19       20
11c      21       22        23      24       25       26       27      28     29       30
12       31       32        33      34       35        36      37      38     39       40
13       41       42        43      44       45        46       47      48     49       50
21       51       52        53      54       55        56       57      58     59       60
21b      61       62        63      64       65        66       67      68     69       70
22       71       72        73      74       75        76       77      78     79       80
...

我想添加两列 Tsoil 和 SWC。理想的数据应该如下所示,

plot      Tsoil  SWC
11          1     6
11b        11    16
11c        21    26
12         32    12
13         43    48
21         54    59
21b        64    69
22         75    80
...

如何在 R 中实现这一点?</p>

标签: r

解决方案


我们可以使用pivot_longer并指定names_sepas _,然后filter基于“ind”列比较(列名的后缀)与“plot”中的数字值

library(dplyr)
library(tidyr)
library(stringr)
df1 %>%
  pivot_longer(cols = -plot, names_to = c( '.value', 'ind'), names_sep="_") %>%
  filter(str_remove(plot, "\\D+") == ind) %>%  
  #or use str_detect`
  #filter(str_detect(plot, ind)) %>%
  select(-ind)  
# A tibble: 8 x 3
# plot  Tsoil   SWC
#  <chr> <dbl> <dbl>
#1 11        1     6
#2 11b      11    16
#3 11c      21    26
#4 12       32    37
#5 13       43    48
#6 21       54    59
#7 21b      64    69
#8 22       75    80

数据

df1 <- structure(list(plot = c("11", "11b", "11c", "12", "13", "21", 
"21b", "22"), Tsoil_11 = c(1L, 11L, 21L, 31L, 41L, 51L, 61L, 
71L), Tsoil_12 = c(2L, 12L, 22L, 32L, 42L, 52L, 62L, 72L), Tsoil_13 = c(3L, 
13L, 23L, 33L, 43L, 53L, 63L, 73L), Tsoil_21 = c(4L, 14L, 24L, 
34L, 44L, 54L, 64L, 74L), Tsoil_22 = c(5L, 15L, 25L, 35L, 45L, 
55L, 65L, 75L), SWC_11 = c(6L, 16L, 26L, 36L, 46L, 56L, 66L, 
76L), SWC_12 = c(7L, 17L, 27L, 37L, 47L, 57L, 67L, 77L), SWC_13 = c(8L, 
18L, 28L, 38L, 48L, 58L, 68L, 78L), SWC_21 = c(9L, 19L, 29L, 
39L, 49L, 59L, 69L, 79L), SWC_22 = c(10L, 20L, 30L, 40L, 50L, 
60L, 70L, 80L)), class = "data.frame", row.names = c(NA, -8L))

推荐阅读