Check whether the name in an email column is duplicated

Problem description

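I have a data frame like the one shown below. I want to check whether the name before the @ in the email column is duplicated and, if it is, mutate a new column coded 1/0 for TRUE/FALSE.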

df <- data.frame(ID =c("DEV2962","KTN2252","ANA2719","ITI2624","DEV2698","HRT2921","","KTN2624","ANA2548","ITI2535","DEV2732","HRT2837","ERV2951","KTN2542","ANA2813","ITI2210"),
                 city=c("del","mum","nav","pun","bang","chen","triv","vish","del","mum","bang","vish","bhop","kol","noi","gurg"),
                 email = c("akash.dev@gmail.com","rahul.singh@gmail.com","salman.abbas@gmail.com","ram.lal@gmail.com","ram.lal@gmail.com","prabal.garg@gmail.com","sanu.ali@gmail.com","kunal.singh@gmail.com","lakhan.tomar@gmail.com","praveen.thakur@gmail.com","sarman.ali@gmail.com","zuber.khan@gmail.com","giriraj.singh@gmail.com","lokesh.sharma@gmail.com","pooja.pawar@gmail.com","nikita.sharma@gmail.com"),
                 name= c("dev,akash","singh,rahul","abbas,salman","lal,ram","singh,nkunj","garg,prabal","ali,sanu","singh,kunal","tomar,lakhan","thakur,praveen","ali,sarman","khan,zuber","singh,giriraj","sharma,lokesh","pawar,pooja","sharma,nikita"))

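I also have an older data frame with the same structure. I want to check whether each email ID exists in the old data frame and, if it does, whether the rest of the record (name, city, ID) is the same.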

I tried using string_detect, but it did not work.

The expected output was shown in an image that is not reproduced here.

Tags: r

Solution


This should solve the first part of the question:

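library(dplyr)    # provides %>% and mutate()
library(stringr)  # provides str_extract()

df %>%
  mutate(first = str_extract(email, "[^@]+"),        # everything before the first "@"
         duplicate = as.numeric(duplicated(first)))  # 1 for a repeated value, 0 otherwise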

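The first line inside mutate() extracts everything up to the @, and the second line flags any duplicated observations of first. Note that duplicated() marks only the second and later occurrences, so the first ram.lal row gets 0 and the second gets 1.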
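
For the second part of the question (comparing against the old data frame), one possible approach is sketched below in base R. It is only an illustration under stated assumptions: new_df and old_df are hypothetical stand-ins with made-up rows, in_old and same_record are illustrative column names, and each email is assumed to appear at most once in the old data.

```r
# Hypothetical miniature data: `new_df` and `old_df` stand in for the current
# and the older data frame; the names and the rows are made up for illustration.
new_df <- data.frame(
  ID    = c("DEV2962", "KTN2252", "ITI2624"),
  city  = c("del", "mum", "pun"),
  email = c("akash.dev@gmail.com", "rahul.singh@gmail.com", "ram.lal@gmail.com"),
  name  = c("dev,akash", "singh,rahul", "lal,ram"),
  stringsAsFactors = FALSE
)
old_df <- data.frame(
  ID    = c("DEV2962", "KTN2252"),
  city  = c("del", "kol"),  # city differs for rahul.singh
  email = c("akash.dev@gmail.com", "rahul.singh@gmail.com"),
  name  = c("dev,akash", "singh,rahul"),
  stringsAsFactors = FALSE
)

# Row position of each new email in the old data frame (NA if absent).
# match() returns the first hit only, hence the uniqueness assumption.
idx <- match(new_df$email, old_df$email)

# 1 if the email exists in the old data, 0 otherwise.
new_df$in_old <- as.numeric(!is.na(idx))

# 1 if ID, city and name all equal the old record, 0 if any of them differ,
# NA if there is no old record to compare against.
same <- new_df$ID == old_df$ID[idx] &
  new_df$city == old_df$city[idx] &
  new_df$name == old_df$name[idx]
new_df$same_record <- as.numeric(same)

print(new_df)
```

With this toy data, akash.dev is found and identical, rahul.singh is found but differs in city, and ram.lal is not in the old data, so same_record is NA for that row. If emails can repeat in the old data frame, a join (for example merge() on email) would be a safer basis for the comparison than match().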

