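r - How to flag/remove specific duplicates based on another variable
Question
I want to delete specific rows based on the value in one column, where the deletion depends on the other values in the same id group. If an "aja" row is grouped together with an "ase" row, I want to delete the "aja" row. If a group contains only "ase" rows or only "aja" rows, the script should leave it alone. The subgroup column below marks the rows the script should delete.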
   id somedata subgroup
1   1 "aja"    okay
2   1 "aja"    okay
3   2 "ase"    okay
4   2 "aja"    delete
5   3 "aja"    delete
6   3 "ase"    okay
7   4 "aja"    okay
8   4 "aja"    okay
9   5 "ase"    okay
10  5 "ase"    okay
11  6 "aja"    delete
12  6 "ase"    okay
Code to generate the data
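# Example data: six id groups with two rows each
id <- c(1,1,2,2,3,3,4,4,5,5,6,6)
somedata <- c("aja","aja","ase","aja","aja","ase","aja","aja","ase","ase","aja","ase")
# Expected outcome for each row, kept for reference
subgroup <- c("okay","okay","okay","DELETE","DELETE","okay","okay","okay","okay","okay","DELETE","okay")
# Note: cbind() coerces every column to character, so id is not numeric in proov
proov <- data.frame(cbind(id, somedata, subgroup))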
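Answer
You can do this with a simple grouped filter:
library(dplyr)

proov %>%
  group_by(id) %>%
  # drop "aja" rows, but only in groups holding more than one distinct value
  filter(!(n_distinct(somedata) > 1 & somedata == 'aja'))
This gives: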
# A tibble: 9 x 3
# Groups:   id [6]
  id    somedata subgroup
  <fct> <fct>    <fct>
1 1     aja      okay
2 1     aja      okay
3 2     ase      okay
4 3     ase      okay
5 4     aja      okay
6 4     aja      okay
7 5     ase      okay
8 5     ase      okay
9 6     ase      okay
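The n_distinct(somedata) > 1 test works here because somedata takes only two values, so "more than one distinct value" is the same as "both aja and ase are present". If a third value could appear, checking explicitly for "ase" in the group would be safer. As a rough sketch of that stricter rule in base R (no packages), assuming the same example data; the names ids_with_ase, keep, flag and result are only illustrative and not from the original post:

```r
# Rebuild the example data with data.frame() directly, so id stays numeric
proov <- data.frame(
  id = c(1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6),
  somedata = c("aja", "aja", "ase", "aja", "aja", "ase",
               "aja", "aja", "ase", "ase", "aja", "ase"),
  stringsAsFactors = FALSE
)

# ids of the groups that contain at least one "ase" row
ids_with_ase <- unique(proov$id[proov$somedata == "ase"])

# Keep a row unless it is an "aja" row inside such a group
keep <- !(proov$id %in% ids_with_ase & proov$somedata == "aja")

# Flag the rows instead of dropping them ...
flag <- ifelse(keep, "okay", "DELETE")

# ... or drop them
result <- proov[keep, ]
print(result)
```

On this data, flag reproduces the subgroup column from the question, and result holds the same nine rows as the dplyr output.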