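r - Check whether a name is duplicated in an email column
Question
I have a data frame like the one below. I want to check whether the name before the @ is duplicated and, if it is, mutate a new column holding 1 or 0 for TRUE and FALSE.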
df <- data.frame(ID =c("DEV2962","KTN2252","ANA2719","ITI2624","DEV2698","HRT2921","","KTN2624","ANA2548","ITI2535","DEV2732","HRT2837","ERV2951","KTN2542","ANA2813","ITI2210"),
city=c("del","mum","nav","pun","bang","chen","triv","vish","del","mum","bang","vish","bhop","kol","noi","gurg"),
email = c("akash.dev@gmail.com","rahul.singh@gmail.com","salman.abbas@gmail.com","ram.lal@gmail.com","ram.lal@gmail.com","prabal.garg@gmail.com","sanu.ali@gmail.com","kunal.singh@gmail.com","lakhan.tomar@gmail.com","praveen.thakur@gmail.com","sarman.ali@gmail.com","zuber.khan@gmail.com","giriraj.singh@gmail.com","lokesh.sharma@gmail.com","pooja.pawar@gmail.com","nikita.sharma@gmail.com"),
name= c("dev,akash","singh,rahul","abbas,salman","lal,ram","singh,nkunj","garg,prabal","ali,sanu","singh,kunal","tomar,lakhan","thakur,praveen","ali,sarman","khan,zuber","singh,giriraj","sharma,lokesh","pawar,pooja","sharma,nikita"))
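I also have an older data frame with the same structure. I want to check whether each email ID exists in the old data frame and, if it does, whether the rest of the record (name, city, ID) is identical.
I tried using string_detect but it did not work.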
The output should look like:
Answer
This should solve the first part of the question:
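library(dplyr)   # provides %>% and mutate()
library(stringr) # provides str_extract()
df %>%
  mutate(first = str_extract(email, "[^@]+"),        # text before the @
         duplicate = as.numeric(duplicated(first)))  # 1 for a repeat, otherwise 0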
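The first expression inside mutate() extracts everything up to the @ into first.
The second flags any repeated value of first. Note that duplicated() marks only the second and later occurrences, so in this data the first ram.lal row gets 0 and the second gets 1.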
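The answer above covers only the first part. For the second part (looking each email up in the old data frame and comparing the rest of the record), a minimal base R sketch follows. It is not the original answerer's method. The names old_df, new_df, duplicate_any, in_old and same_record are hypothetical, and the two small data frames are made-up stand-ins built from a few rows of the question's data. The sketch also shows one way to flag every occurrence of a repeated name rather than only the repeats.

```r
# Hypothetical stand-ins: new_df mimics a few rows of df from the question,
# old_df is an invented "older" data frame with the same columns.
new_df <- data.frame(
  ID    = c("ITI2624", "DEV2698", "HRT2921", "KTN2624"),
  city  = c("pun", "bang", "chen", "vish"),
  email = c("ram.lal@gmail.com", "ram.lal@gmail.com",
            "prabal.garg@gmail.com", "kunal.singh@gmail.com"),
  name  = c("lal,ram", "singh,nkunj", "garg,prabal", "singh,kunal"),
  stringsAsFactors = FALSE
)
old_df <- data.frame(
  ID    = c("ITI2624", "HRT2921"),
  city  = c("pun", "del"),
  email = c("ram.lal@gmail.com", "prabal.garg@gmail.com"),
  name  = c("lal,ram", "garg,prabal"),
  stringsAsFactors = FALSE
)

# Part of the email before the "@" (base R equivalent of the str_extract step)
new_df$first <- sub("@.*$", "", new_df$email)

# Flag every row whose name occurs more than once, not only the later repeats
new_df$duplicate_any <- as.numeric(
  duplicated(new_df$first) | duplicated(new_df$first, fromLast = TRUE)
)

# Row of old_df holding the same email (NA when absent);
# match() returns the first hit if old_df itself has repeated emails
idx <- match(new_df$email, old_df$email)
new_df$in_old <- as.numeric(!is.na(idx))

# For emails found in old_df, check that ID, city and name all agree.
# The result is NA for rows whose email is not in old_df.
same <- new_df$ID == old_df$ID[idx] &
  new_df$city == old_df$city[idx] &
  new_df$name == old_df$name[idx]
new_df$same_record <- as.numeric(same)

print(new_df)
```

Using match() keeps new_df at its original row count. A join would be an alternative, but it can multiply rows when the old data frame contains the same email more than once.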