Filter out phone numbers with fewer than 10 digits in a column - R

Problem description

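I have a dataset and I am trying to build an identifier under the following conditions: take the first 3 letters of first, the last 3 letters of Last, and the last 4 digits of Phone1. If Phone1 is empty, take the last 4 digits of Phone2 instead. I only want to keep the rows where Phone1 or Phone2 has 10 digits.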

first <- c("apple", "grape", "rose", "Jasmine", "Apricots", "mango", "banana", "Blueberries")
Last <- c("Jackfruit", "Kiwi", "Mulberry", "rabbit ", "pine", "Limes", "", "Nectarine")
Phone1<-c("1234567890", "(456)7089123", "1230789456", "", "999999", " ", "1112223334", "887775")
Phone2<-c("1234737650", "", "15", "8888888888", "99", "3336783245 ", "", "") 
df <- data.frame(first, Last ,Phone1,Phone2)    

Expected output:

[image: expected output, not available]

Tags: r

Solution

Going by what you showed (the first 3 letters of Last) rather than what you said (the last 3 letters of Last):

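# Return the last 4 digits of the first valid (exactly 10-digit) phone number
# among the vectors passed in, or "" when none of them is valid.
phone_coalesce <- function(...) {
  dots <- list(...)
  if (!length(dots)) return(character(0))
  # strip everything that is not a digit (parentheses, spaces, etc.)
  dots <- lapply(dots, function(s) gsub("[^0-9]", "", s))
  # keep only numbers with exactly 10 digits; anything else becomes NA
  dots <- lapply(dots, function(s) ifelse(nchar(s) == 10L, s, NA_character_))
  out <- rep(NA_character_, length(dots[[1]]))
  # fill the slots that are still NA from each vector in turn
  for (dot in dots) {
    isna <- is.na(out)
    if (!any(isna)) break
    out[isna] <- dot[isna]
  }
  out[is.na(out)] <- ""
  substr(out, 7, 10)  # last 4 of the 10 digits
}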

paste0(substr(df$first, 1, 3),
       substr(df$Last, 1, 3),
       with(df, phone_coalesce(Phone1, Phone2)))
# [1] "appJac7890" "graKiw9123" "rosMul9456" "Jasrab8888" "Aprpin"    
# [6] "manLim3245" "ban3334"    "BluNec"    
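
To see what the two cleaning steps inside phone_coalesce do before the values are coalesced, here is a minimal sketch that applies them to Phone1 alone. The names digits and valid are only illustrative and do not appear in the answer.

```r
Phone1 <- c("1234567890", "(456)7089123", "1230789456", "", "999999", " ",
            "1112223334", "887775")

# Step 1: drop every character that is not a digit
digits <- gsub("[^0-9]", "", Phone1)
digits
# [1] "1234567890" "4567089123" "1230789456" ""           "999999"
# [6] ""           "1112223334" "887775"

# Step 2: keep only the entries with exactly 10 digits, NA otherwise
valid <- ifelse(nchar(digits) == 10L, digits, NA_character_)
valid
# [1] "1234567890" "4567089123" "1230789456" NA           NA
# [6] NA           "1112223334" NA
```

The NA entries are the positions that the loop then tries to fill from Phone2.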

To store the identifier as a new column and keep only the rows that have a 10-digit phone number:

tempph <- with(df, phone_coalesce(Phone1, Phone2))
tempph
# [1] "7890" "9123" "9456" "8888" ""     "3245" "3334" ""    
df$new <- paste0(substr(df$first, 1, 3), substr(df$Last, 1, 3), tempph)
df[nzchar(tempph),]
#     first      Last       Phone1      Phone2        new
# 1   apple Jackfruit   1234567890  1234737650 appJac7890
# 2   grape      Kiwi (456)7089123             graKiw9123
# 3    rose  Mulberry   1230789456          15 rosMul9456
# 4 Jasmine   rabbit                8888888888 Jasrab8888
# 6   mango     Limes              3336783245  manLim3245
# 7  banana             1112223334                ban3334
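
If the identifier should instead use the last 3 letters of Last, as the question text says, the substr(df$Last, 1, 3) call could be swapped for something like the sketch below. last_n is a hypothetical helper, not part of the answer above, and trimming whitespace first is an assumption made because of values such as "rabbit ".

```r
# Hypothetical helper: last n characters of each string, after trimming whitespace
last_n <- function(x, n = 3L) {
  x <- trimws(x)
  len <- nchar(x)
  # start at least at position 1 so short or empty strings do not break
  substr(x, pmax(len - n + 1L, 1L), len)
}

Last <- c("Jackfruit", "Kiwi", "Mulberry", "rabbit ", "pine", "Limes", "", "Nectarine")
last_n(Last)
# [1] "uit" "iwi" "rry" "bit" "ine" "mes" ""    "ine"
```

With that helper, the identifier would be built as paste0(substr(df$first, 1, 3), last_n(df$Last), tempph).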
