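r - Validating values across two data frames
Problem description
I have two data frames with approximately 1 million records. For the rows where city = mum, I am trying to check whether each Uniq_ID in df2 also exists in df1. I then want to mutate df2 with a 1 or 0 to indicate true or false.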
df1 <- data.frame(ID =c("DEV2962","KTN2252","ANA2719","ITI2624","DEV2698","HRT2921","","KTN2624","ANA2548","ITI2535","DEV2732","HRT2837","ERV2951","KTN2542","ANA2813","ITI2210"),
city=c("del","mum","mum","pun","bang","mum","triv","vish","mum","mum","bang","vish","mum","kol","noi","mum"))
df2 <- data.frame(Uniq_ID =c("DEV2962","KTN2252","ANA2719","H7236","DEV2692","HRT2921","","KTN2624","ANA2548","ITI2535","DEV2732","HRT2831","ERV2951","KTN2542","ANA2813","ITI2210"),
city=c("del","mum","bho","pun","mum","chen","mum","vish","mum","mum","bang","mum","mum","kol","noi","mum"))
Solution
We can use base R in this case. Does this work:
> df2$ID_not_in_df1 <- ifelse(!df2$Uniq_ID %in% df1$ID & df2$city == 'mum', 1, 0)
> df2
   Uniq_ID city ID_not_in_df1
1  DEV2962  del             0
2  KTN2252  mum             0
3  ANA2719  bho             0
4    H7236  pun             0
5  DEV2692  mum             1
6  HRT2921 chen             0
7           mum             0
8  KTN2624 vish             0
9  ANA2548  mum             0
10 ITI2535  mum             0
11 DEV2732 bang             0
12 HRT2831  mum             1
13 ERV2951  mum             0
14 KTN2542  kol             0
15 ANA2813  noi             0
16 ITI2210  mum             0
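As a possible variation on the answer's one-liner, the sketch below builds the logical condition first and converts it with as.integer() instead of ifelse(). The intermediate name `flag` is an assumption introduced for illustration; the data and the ID_not_in_df1 column name come from the question and answer. The parentheses around the %in% test do not change the result (in R, `!` already applies to the whole %in% expression); they only make the precedence explicit.

```r
# Sample data copied from the question.
df1 <- data.frame(
  ID = c("DEV2962", "KTN2252", "ANA2719", "ITI2624", "DEV2698", "HRT2921",
         "", "KTN2624", "ANA2548", "ITI2535", "DEV2732", "HRT2837",
         "ERV2951", "KTN2542", "ANA2813", "ITI2210"),
  city = c("del", "mum", "mum", "pun", "bang", "mum", "triv", "vish",
           "mum", "mum", "bang", "vish", "mum", "kol", "noi", "mum")
)
df2 <- data.frame(
  Uniq_ID = c("DEV2962", "KTN2252", "ANA2719", "H7236", "DEV2692", "HRT2921",
              "", "KTN2624", "ANA2548", "ITI2535", "DEV2732", "HRT2831",
              "ERV2951", "KTN2542", "ANA2813", "ITI2210"),
  city = c("del", "mum", "bho", "pun", "mum", "chen", "mum", "vish",
           "mum", "mum", "bang", "mum", "mum", "kol", "noi", "mum")
)

# `flag` (hypothetical name) is TRUE where the df2 row has city "mum"
# and its Uniq_ID has no match anywhere in df1$ID.
flag <- !(df2$Uniq_ID %in% df1$ID) & df2$city == "mum"

# as.integer() maps TRUE/FALSE to 1/0, so ifelse() is not needed.
df2$ID_not_in_df1 <- as.integer(flag)

# Inspect only the flagged rows (rows 5 and 12 for this sample).
print(df2[flag, ])
```

Note that the empty Uniq_ID in row 7 of df2 counts as present, because df1$ID also contains an empty string. If blank IDs should be treated as missing instead, they would need to be handled separately before the comparison. The same logical expression can also be placed inside dplyr::mutate() if a pipeline style is preferred.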