How to correctly pass inequalities or ranges to dplyr::case_when

Question

I have been trying, without success, to use dplyr::case_when with intervals to create the levels of a given variable.

# Prepare the sample data

library(tidyverse)  # needed here already: gather() comes from tidyr
mtmodel <- lm(mpg ~ wt, data = mtcars)
# predict(..., interval = "confidence") returns the columns fit, lwr, upr
mtcars$Low <- predict(mtmodel, newdata = mtcars, interval = "confidence")[,2]
mtcars$High <- predict(mtmodel, newdata = mtcars, interval = "confidence")[,3]
mtcars$Mean <- predict(mtmodel, newdata = mtcars, interval = "confidence")[,1]
new_mtcars<-gather(mtcars, "Variable", "value", Low:Mean)

# Create the groups with dplyr::case_when

#does not work
library(tidyverse)
new_new_mtcars<-new_mtcars %>%
       mutate(grouping = case_when (
       min(new_mtcars$wt) <= new_mtcars$wt<= mean(new_mtcars$wt)+0.99 ~ "group1",
       new_mtcars$wt >= max(new_mtcars$wt) - 0.5  ~ "group2"))

# R returns these error messages and does not do what was intended

Error: unexpected '<=' in:
"           mutate(grouping = case_when (
               min(new_mtcars$wt) <= new_mtcars$wt<="

Error: unexpected ')' in "           
new_mtcars$wt >= max(new_mtcars$wt) - 0.5  ~ "group2")"

Tags: r, dplyr, tidyverse, tidyr

Answer

Try this:

new_new_mtcars <- new_mtcars %>%
  mutate(grouping = case_when(
    min(wt) <= wt & wt <= mean(wt) + 0.99 ~ "group1",
    wt >= max(wt) - 0.5  ~ "group2"
  ))

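You do not need to refer to your data frame again inside the pipe after the first reference: within mutate(), columns can be used by their bare names. In addition, min(wt) <= wt <= mean(wt) + 0.99 will always throw an error, because each comparison operator takes exactly two operands and R's parser does not allow comparisons to be chained. You therefore have to write wt <= mean(wt) + 0.99 as a separate condition and combine the two with &.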
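To see that the failure comes from the parser rather than from case_when itself, here is a minimal base R sketch. The `wt` vector is made-up illustrative data, not the mtcars column, and the names `chained` and `in_range` are placeholders.

```r
# A chained comparison is rejected at parse time, before anything is evaluated.
chained <- try(parse(text = "1 <= 2 <= 3"), silent = TRUE)
inherits(chained, "try-error")

# Two binary comparisons joined with & are valid and work element-wise.
wt <- c(1.5, 2.8, 3.4, 5.2)  # hypothetical weights, for illustration only
in_range <- min(wt) <= wt & wt <= mean(wt) + 0.99
in_range
```

Use & here and not &&: case_when needs one logical value per row, and & is the element-wise operator.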

One exception is a function such as between(): you give the variable being tested first, then the lower and upper bounds (both inclusive), as follows:

new_new_mtcars <- new_mtcars %>%
  mutate(grouping = case_when(
    between(wt, min(wt), mean(wt) + 0.99) ~ "group1",
    wt >= max(wt) - 0.5  ~ "group2"
  ))
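As a quick check that the two spellings select the same rows, the sketch below computes both conditions side by side on the built-in mtcars data. The column names `by_and` and `by_between` are made up for this illustration. Note that case_when assigns NA to any row that matches none of the conditions, so it can be worth tabulating the result with missing values included.

```r
library(dplyr)

grouped <- mtcars %>%
  mutate(
    # the same range written with & and with between()
    by_and     = min(wt) <= wt & wt <= mean(wt) + 0.99,
    by_between = between(wt, min(wt), mean(wt) + 0.99),
    grouping   = case_when(
      by_between          ~ "group1",
      wt >= max(wt) - 0.5 ~ "group2"
    )
  )

# TRUE when both spellings flag exactly the same rows
identical(grouped$by_and, grouped$by_between)

# Count the rows per group; unmatched rows, if any, show up as NA
table(grouped$grouping, useNA = "ifany")
```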
