Conditionally fill a column if a previous column contains a value?

Problem description

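I want to conditionally fill a column using dplyr::mutate. One level of the new variable should indicate that a given value is present in the previous column, and the other level is the "otherwise" condition.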

I have a data frame:

        group      piece      answer      agreement
        group1     A          noise       good 
        group1     A          silence     good
        group1     A          silence     good
        group1     B          silence     bad
        group1     B          loud_noise  bad
        group1     B          noise       bad
        group1     B          loud_noise  bad
        group1     B          noise       bad
        group2     C          silence     good
        group2     C          silence     good

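I want to create a new variable, computed per group, such that if 'bad' appears anywhere in 'agreement' the value is 'inconsistent', but if all values of 'agreement' are 'good' the value is 'consistent'.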

        group      piece      answer      agreement  new_agreement
        group1     A          noise       good       bad
        group1     A          silence     good       bad
        group1     A          silence     good       bad
        group1     B          silence     bad        bad
        group1     B          loud_noise  bad        bad
        group1     B          noise       bad        bad
        group1     B          loud_noise  bad        bad
        group1     B          noise       bad        bad
        group2     C          silence     good       good
        group2     C          silence     good       good

But case_when doesn't quite do this; it just relabels each row of the same variable:

    newdf <- df %>%
      group_by(group) %>%
      mutate(new_agreement = case_when(
        agreement == 'bad'  ~ "inconsistent",
        agreement == 'good' ~ "consistent"
      )) %>%
      as.data.frame()

Tags: r, dplyr

Solution


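Just use any(agreement == 'bad') as the first condition. Within each group, any() returns a single TRUE or FALSE, so the label applies to every row of that group: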

df %>%
  group_by(group) %>%
  mutate(new_agreement = case_when(any(agreement == 'bad') ~"inconsistent",
                                   agreement =='good' ~ "consistent"))
    # A tibble: 10 x 5
    # Groups:   group [2]
       group  piece answer     agreement new_agreement
       <fct>  <fct> <fct>      <fct>     <chr>        
     1 group1 A     noise      good      inconsistent 
     2 group1 A     silence    good      inconsistent 
     3 group1 A     silence    good      inconsistent 
     4 group1 B     silence    bad       inconsistent 
     5 group1 B     loud_noise bad       inconsistent 
     6 group1 B     noise      bad       inconsistent 
     7 group1 B     loud_noise bad       inconsistent 
     8 group1 B     noise      bad       inconsistent 
     9 group2 C     silence    good      consistent   
    10 group2 C     silence    good      consistent   
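
To see why the original attempt only relabelled each row, it may help to compare the elementwise test with any() on a plain vector. This is a minimal sketch in base R; the three-element vector is a made-up stand-in for one group's `agreement` values, not the data from the question.

```r
# Hypothetical stand-in for the `agreement` values of a single group
agreement <- c("good", "good", "bad")

# Elementwise comparison: one logical per row, so case_when() decides row by row
row_flags <- agreement == "bad"
print(row_flags)   # FALSE FALSE TRUE

# any() collapses the comparison to a single logical for the whole group
group_flag <- any(row_flags)
print(group_flag)  # TRUE
```

Inside group_by(group) followed by mutate(), that single value is recycled across the rows of the group, which is why every group1 row becomes "inconsistent".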

You can also use if_else with any():

df %>%
  group_by(group) %>%
  mutate(new_agreement = if_else(any(agreement == "bad"), "inconsistent", "consistent"))
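
For a quick end-to-end check, the sketch below rebuilds the example data and compares the two variants. The way `df` is constructed and the names `with_case_when` and `with_if_else` are assumptions made for illustration; only the two mutate() calls come from the answer above.

```r
library(dplyr)

# Rebuild the example data from the question (this construction is an assumption)
df <- data.frame(
  group     = rep(c("group1", "group2"), times = c(8, 2)),
  piece     = rep(c("A", "B", "C"), times = c(3, 5, 2)),
  answer    = c("noise", "silence", "silence",
                "silence", "loud_noise", "noise", "loud_noise", "noise",
                "silence", "silence"),
  agreement = rep(c("good", "bad", "good"), times = c(3, 5, 2))
)

# Variant 1: case_when() with a group-level any() as the first condition
with_case_when <- df %>%
  group_by(group) %>%
  mutate(new_agreement = case_when(
    any(agreement == "bad") ~ "inconsistent",
    agreement == "good"     ~ "consistent"
  )) %>%
  ungroup()

# Variant 2: if_else() on the same group-level test
with_if_else <- df %>%
  group_by(group) %>%
  mutate(new_agreement = if_else(any(agreement == "bad"),
                                 "inconsistent", "consistent")) %>%
  ungroup()

# Compare the labels produced by the two variants
print(identical(with_case_when$new_agreement, with_if_else$new_agreement))
```

The two variants can differ on other data: if a group has no 'bad' but contains a value other than 'good', case_when() leaves those rows as NA, whereas if_else() labels them "consistent".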
