Replacing the first digit of a number in R with sub and if

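Problem description

I want to change the leading 5 to 1 in the three-character numbers that start with 5. I tried using sub inside a condition, but it failed.

DO_concatenated: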

    DTNASC   AGE
1   3031997  520
2   9022017  0
3   13071933 83
4   6022002  515
5   2061966  50
6   28121946 70
7   4121955  61
8   3101943  73
9   6022017  20
10  14012017 0
11  20071931 8

if((nchar(DO_concatenated$AGE) == 3)&(funcaoidade(DO_concatenated$AGE) == 5)){
  DO_concatenated$IDADE = sub(pattern = 5, replacement = 1, DO_concatenated$AGE) 
}

If it worked, the output would look like this:

    DTNASC   AGE
1   3031997  120
2   9022017  0
3   13071933 83
4   6022002  115
5   2061966  50
6   28121946 70
7   4121955  61
8   3101943  73
9   6022017  20
10  14012017 0
11  20071931 8

I did something similar earlier to remove the leading 4 from values that start with 4, using this code:

if((nchar(DO_concatenated$IDADE) == 3)&(funcaoidade(DO_concatenated$IDADE) == 4)){
  DO_concatenated$IDADE = sub(pattern = 4, replacement = "", DO_concatenated$IDADE) 
}

And it worked!

"funcaoidade" returns the first character of the number:

funcaoidade = function(x){
  substr(x, start = 1, stop = 1)
}

So what is the difference? Thanks in advance!

Tags: r, if-statement, replace, conditional-statements

Solution
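
One possible reason the attempt in the question misbehaves: `if` in R expects a single TRUE or FALSE, while the condition in the question produces one value per row. To my knowledge, R versions before 4.2.0 use only the first element and emit a warning, and later versions raise an error. The sketch below avoids calling `if` and only inspects the pieces. It assumes the AGE values shown in the question; the names `age`, `cond` and `unconditional` are illustrative and not from the original post.

```r
# AGE values copied from the question
age <- c(520, 0, 83, 515, 50, 70, 61, 73, 20, 0, 8)

# The condition is a logical vector with one value per row,
# not the single TRUE/FALSE that `if` expects
cond <- nchar(age) == 3 & substr(age, 1, 1) == "5"
length(cond)   # 11
which(cond)    # rows 1 and 4

# sub() applied to the whole column rewrites the first 5 in every
# element that contains one, including the two-digit value 50
unconditional <- sub(pattern = 5, replacement = 1, age)
unconditional[5]   # "10"
```

So even when the `if` branch runs, `sub` is applied to every row rather than only to the rows that satisfy the condition, which is why a per-row tool such as `ifelse` is a better fit.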


Here is one way to do this with the stringr package:

library(dplyr)
library(stringr)

data <-
  data.frame(
    DTNASC = c(3031997, 9022017, 13071933, 6022002, 2061966, 28121946, 4121955, 
               3101943, 6022017, 14012017, 20071931),
    AGE = c(520, 0, 83, 515, 50, 70, 61, 73, 20, 0, 8)
  )

data %>%
  mutate(
    # Replacement in AGE
    # Convert to character so the digits can be inspected
    AGE = as.character(AGE),
    # str_sub(AGE, 1, 1) == "5"  -> the first character is 5
    # nchar(AGE) == 3            -> AGE has exactly 3 characters
    # str_replace(AGE, "5", "1") -> replaces the first 5 with 1
    # as.numeric()               -> converts the result back to a number
    AGE = ifelse(str_sub(AGE, 1, 1) == "5" & nchar(AGE) == 3,
                 as.numeric(str_replace(AGE, "5", "1")), as.numeric(AGE)),

    # Replacement in DTNASC
    # Convert to character so the digits can be inspected
    DTNASC = as.character(DTNASC),
    # str_sub(DTNASC, 1, 1) == "4"  -> the first character is 4
    # nchar(DTNASC) == 7            -> DTNASC has exactly 7 characters
    # str_replace(DTNASC, "4", "")  -> removes the first 4
    # as.numeric()                  -> converts the result back to a number
    DTNASC = ifelse(str_sub(DTNASC, 1, 1) == "4" & nchar(DTNASC) == 7,
                    as.numeric(str_replace(DTNASC, "4", "")), as.numeric(DTNASC)))

#   DTNASC AGE
#  3031997 120
#  9022017   0
# 13071933  83
#  6022002 115
#  2061966  50
# 28121946  70
#   121955  61
#  3101943  73
#  6022017  20
# 14012017   0
# 20071931   8
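
The same AGE replacement can also be sketched in base R without extra packages, by putting the whole condition into an anchored regular expression. This is only one possible alternative, assuming the sample data from the question; `age_chr` is an illustrative name that does not appear in the original post.

```r
# Sample data copied from the question
DO_concatenated <- data.frame(
  DTNASC = c(3031997, 9022017, 13071933, 6022002, 2061966, 28121946,
             4121955, 3101943, 6022017, 14012017, 20071931),
  AGE = c(520, 0, 83, 515, 50, 70, 61, 73, 20, 0, 8)
)

# "^5([0-9]{2})$" matches only a 5 followed by exactly two digits,
# so two-digit values such as 50 are left alone;
# "1\\1" writes a 1 followed by the two captured digits
age_chr <- as.character(DO_concatenated$AGE)
DO_concatenated$AGE <- as.numeric(sub("^5([0-9]{2})$", "1\\1", age_chr))

DO_concatenated$AGE
```

Because `sub` is vectorized and the pattern itself encodes both "three characters long" and "starts with 5", no `if` or `ifelse` is needed here.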
