r - Using case_when() within mutate_at() to recode several columns with different types of NA
Problem description
Given the data:
df <- structure(list(cola = structure(c(5L, 9L, 6L, 2L, 7L, 10L, 3L,
8L, 1L, 4L), .Label = c("a", "b", "d", "g", "q", "r", "t", "w",
"x", "z"), class = "factor"), colb = c(156L, 8L, 6L, 100L, 49L,
31L, 189L, 77L, 154L, 171L), colc = c(0.207140279468149, 0.51990159181878,
0.402017514919862, 0.382948065642267, 0.488511856179684, 0.263168515404686,
0.38591041485779, 0.774066215148196, 0.763264901703224, 0.474355421960354
), cold = structure(c(1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L), .Label = c("a",
"b"), class = "factor")), class = "data.frame", row.names = c(NA,
-10L))
df
# cola colb colc cold
# 1 q 156 0.2071403 a
# 2 x 8 0.5199016 b
# 3 r 6 0.4020175 a
# 4 b 100 0.3829481 b
# 5 t 49 0.4885119 a
# 6 z 31 0.2631685 b
# 7 d 189 0.3859104 a
# 8 w 77 0.7740662 b
# 9 a 154 0.7632649 a
# 10 g 171 0.4743554 b
If the value in colc in a particular row is >= 0.5, I would like to replace the contents of all the other cells in that row with NA, except for the contents of cold for that row (which I would like to retain as it is).
I attempted this with dplyr::mutate_at() and base::ifelse(), and it works fine:
df %>% mutate_at(vars(-c(cold)), funs(ifelse(colc >= 0.5, NA, .)))
# cola colb colc cold
# 1 5 156 0.2071403 a
# 2 NA NA NA b
# 3 6 6 0.4020175 a
# 4 2 100 0.3829481 b
# 5 7 49 0.4885119 a
# 6 10 31 0.2631685 b
# 7 3 189 0.3859104 a
# 8 NA NA NA b
# 9 NA NA NA a
# 10 4 171 0.4743554 b
But I would like to do this with dplyr::case_when(), as I might have more than one replacement condition to fulfill (e.g., replace with "foo" if colc < 0.5 & colc >= 0.3). But case_when() does not appear to be playing nice:
df %>% mutate_at(vars(-c(cold)), funs(case_when(colc >= 0.5 ~ NA, TRUE ~ .)))
Error: must be a logical vector, not a factor object
Why is this happening and what can I do to fix it? I assume this is because I am trying to convert multiple columns with different data types to NA. I tried to look for a solution online, but I wasn't able to find one.
Edit: specifically, I would like to preserve the data types of the various columns as they are.
Solution
library(dplyr)
df %>%
mutate_at(vars(-c(cold)), ~ case_when(colc >= 0.5 ~ `is.na<-`(., TRUE), TRUE ~ .))
# cola colb colc cold
# 1 q 156 0.2071403 a
# 2 <NA> NA NA b
# 3 r 6 0.4020175 a
# 4 b 100 0.3829481 b
# 5 t 49 0.4885119 a
# 6 z 31 0.2631685 b
# 7 d 189 0.3859104 a
# 8 <NA> NA NA b
# 9 <NA> NA NA a
# 10 g 171 0.4743554 b
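The same idea can also be sketched with across(), which newer dplyr versions offer in place of mutate_at(). This is only a sketch under assumptions: it needs a dplyr version that provides across(), and the three-row data frame is an illustrative stand-in rather than the data from the question.

```r
library(dplyr)

# Small stand-in for the question's data (values are illustrative).
df <- data.frame(
  cola = factor(c("q", "x", "r")),
  colb = c(156L, 8L, 6L),
  colc = c(0.21, 0.52, 0.40),
  cold = factor(c("a", "b", "a"))
)

res <- df %>%
  mutate(across(
    -cold,
    ~ case_when(
      colc >= 0.5 ~ `is.na<-`(.x, TRUE),  # all-NA copy with the column's own type
      TRUE        ~ .x                    # otherwise keep the original value
    )
  ))

sapply(res, class)  # classes should match sapply(df, class)
```

Because colc is the last column selected, cola and colb are recoded against the original colc values before colc itself is overwritten.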
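Explanation
When using case_when() to assign NA, you need to specify the type of NA, i.e. NA_integer_, NA_real_, NA_complex_, or NA_character_. However, mutate_at() transforms multiple columns at once, and these columns have different types, so you cannot apply one statement to all of them. Ideally there would be something like NA_guess that identifies the type, but I have not found one so far. This method is a little tricky: I use the replacement form of is.na() to convert the input vector to NAs, which will have the same type as the input vector. For example: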
x <- 1:5
is.na(x) <- TRUE ; x
# [1] NA NA NA NA NA
class(x)
# [1] "integer"
y <- letters[1:5]
is.na(y) <- TRUE ; y
# [1] NA NA NA NA NA
class(y)
# [1] "character"
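The solution calls the replacement function directly as `is.na<-`(., TRUE), which returns a modified copy instead of overwriting a variable. A minimal base R sketch of that behaviour follows; na_like is a hypothetical helper name introduced here for illustration and is not part of base R or dplyr.

```r
# Hypothetical helper: return an all-NA copy of x that keeps x's type.
na_like <- function(x) `is.na<-`(x, TRUE)

f <- factor(c("a", "b", "a"))

class(na_like(f))          # factor class is kept
levels(na_like(f))         # levels are kept as well
class(na_like(1:3))        # integer input gives integer NAs
class(na_like(c(1.5, 2)))  # double input gives double NAs

f  # the original vector is not modified
```

Since both branches of case_when() then have the same type for every column, one formula can serve columns of different types.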