r - Creating variables from list objects in R
Problem description
I'm trying to create a binary variable from data spread across multiple columns: every column whose name contains a given string should be checked for a set of values. I'll use iris as an example dataset.
Let's say I want a variable where, for any column whose name contains the string "Sepal", rows holding the values 5.1, 3.0, or 4.7 in those columns become "Class A", while rows holding 3.1, 5.0, or 5.4 become "Class B". Here are the first few rows of iris:
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa
The first 3 rows should then fall under "Class A", while rows 4-6 should fall under "Class B". I tried writing this code to do that:
mutate(iris, Class = if_else(
vars(contains("Sepal")), any_vars(. %in% c(5.1,3.0, 4.7))), "Class A",
ifelse(vars(contains("Sepal")), any_vars(. %in% c(3.1, 5.0, 5.4))), "Class B",NA)
and received the error
Error: `condition` must be a logical vector, not a `quosures/list` object
So I've realized I need lapply here, but I'm not sure where to begin. I don't know how to express both the selection of columns with "Sepal" in the name and the specific values in those rows as a single list object to pass to lapply.
This is clearly the wrong syntax:
lapply(vars(contains("Sepal")), any_vars(. %in% c(5.1,3.0, 4.7)))
Examples using case_when will also be accepted as answers.
Solution
If you want to do this using dplyr, you can use rowwise with the new c_across:
library(dplyr)

iris %>%
  rowwise() %>%
  mutate(Class = case_when(
    any(c_across(contains("Sepal")) %in% c(5.1, 3.0, 4.7)) ~ 'Class A',
    any(c_across(contains("Sepal")) %in% c(3.1, 5.0, 5.4)) ~ 'Class B')) %>%
  head()
# Sepal.Length Sepal.Width Petal.Length Petal.Width Species Class
# <dbl> <dbl> <dbl> <dbl> <fct> <chr>
#1 5.1 3.5 1.4 0.2 setosa Class A
#2 4.9 3 1.4 0.2 setosa Class A
#3 4.7 3.2 1.3 0.2 setosa Class A
#4 4.6 3.1 1.5 0.2 setosa Class B
#5 5 3.6 1.4 0.2 setosa Class B
#6 5.4 3.9 1.7 0.4 setosa Class B
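However, note that using %in% on numeric values is not reliable: it tests for exact equality, and floating-point numbers that print the same are not always stored as the same value. If interested, you may read "Why are these numbers not equal?".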
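To make the caveat about %in% concrete, here is a minimal base R sketch of a tolerance-based match. The helper name near_any, the tolerance of 1e-8, and the copy iris2 are assumptions made for illustration and are not part of the original answer. The sketch uses apply over the "Sepal" columns rather than rowwise, so it needs no packages.

```r
# Exact matching can fail for values produced by arithmetic
x <- 0.1 + 0.2
x %in% 0.3                  # FALSE: x is not stored exactly as 0.3
isTRUE(all.equal(x, 0.3))   # TRUE: equal within a small tolerance

# Hypothetical helper: TRUE if any element of x lies within tol of any target
near_any <- function(x, targets, tol = 1e-8) {
  any(abs(outer(x, targets, "-")) < tol)
}

# Work on a copy so the built-in dataset is left untouched
iris2 <- iris
sepal_cols <- grep("Sepal", names(iris2), value = TRUE)

# Check each row of the Sepal columns against both sets of values
is_a <- apply(iris2[sepal_cols], 1, near_any, targets = c(5.1, 3.0, 4.7))
is_b <- apply(iris2[sepal_cols], 1, near_any, targets = c(3.1, 5.0, 5.4))

# Class A takes priority, mirroring the order of the case_when conditions
iris2$Class <- ifelse(is_a, "Class A", ifelse(is_b, "Class B", NA))
head(iris2$Class)
```

The same helper could stand in for the any(... %in% ...) expressions inside case_when, for example near_any(c_across(contains("Sepal")), c(5.1, 3.0, 4.7)) ~ 'Class A'. For the literal values stored in iris, both approaches give the same classes; the tolerance only matters when the compared numbers come from a calculation.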