r - Subset dataframe based on indicator variables
Problem description
Using R, how can one subset a dataframe that has indicator variables, based on a vector of columns?
# Dataframe with 3 indicator variables - a, b, and c
df = data.frame(a = c(1, 0), b = c(1, 1), c = c(0, 1))
subset.iv = function (df, cols) {
# ???
}
# Subset rows that match a or c (i.e. a=1 or c=1):
subset.iv(df, c('a', 'c'))
# Subset rows that match b (i.e. b=1):
subset.iv(df, c('b'))
I know how to subset a dataframe based on a known/static condition (e.g. df[df$a == 1 | df$b == 1,]).
But in this case the problem is that I can't write the condition expression since I don't know the number of columns to check for, or the columns themselves.
Also, subset doesn't allow passing a custom function where I might be able to parse the vector and check for columns.
Solution
Assuming your indicators are positive when set and zero otherwise, something like this might work:
subset.iv = function (df, cols) {
  # Keep rows whose selected indicator columns sum to more than zero
  df[rowSums(df[cols]) > 0, ]
}
which gives
> subset.iv(df, c('a', 'c'))
a b c
1 1 1 0
2 0 1 1
> subset.iv(df, c('b'))
a b c
1 1 1 0
2 0 1 1
> subset.iv(df, c('c'))
a b c
2 0 1 1
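If the columns might hold NA or values other than 0/1, the rowSums approach can give surprising results (an NA in any selected column makes that row's sum NA). Below is a minimal sketch of an alternative that tests each column for == 1 explicitly and ORs the results together. The name subset_iv_any and the choice to treat NA as "not set" are assumptions for illustration, not part of the original answer.

```r
# Hypothetical variant of subset.iv: keep rows where any listed column equals 1.
subset_iv_any <- function(df, cols) {
  # One logical vector per column: TRUE where the indicator is exactly 1.
  # NA entries are treated as "not set" (an assumption, not from the original).
  hits <- lapply(df[cols], function(x) !is.na(x) & x == 1)
  # Combine the per-column vectors with element-wise OR.
  keep <- Reduce(`|`, hits)
  # drop = FALSE keeps the result a data frame even if df has a single column.
  df[keep, , drop = FALSE]
}

df <- data.frame(a = c(1, 0), b = c(1, 1), c = c(0, 1))
subset_iv_any(df, c("a", "c"))  # rows matching a = 1 or c = 1
subset_iv_any(df, "c")          # rows matching c = 1
```

Swapping `|` for `&` in the Reduce call should turn this into an "all columns set" filter instead of "any column set".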