How to fill a column based on a condition using sum() for matches in R

Problem description

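I am struggling to fill a column based on a condition. Maybe my approach is heading in the wrong direction; I don't know. My conditions are as follows: match should be "C" when a row contains two "c" values, "B" when it contains two "b" values and one "a", and NA otherwise.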

So far I have tried the following, but I can see it is not accurate: the vector I compare against is built from the entire column rather than from each row, and it still doesn't work.

set.seed(123)
df_letters <- data.frame(basket1 = sample(letters[1:3], 5,  replace = TRUE, prob = c(0.85,0.10,0.5)),
                        basket2 = sample(letters[1:3], 5,  replace = TRUE, prob = c(0.10,0.85,0.5)),
                        basket3 = sample(letters[1:3], 5,  replace = TRUE, prob=c(0.5,0.10,0.85)),
                        stringsAsFactors = FALSE)


df_letters %>% mutate(match = ifelse(sum(as.character(as.vector(df_letters))  == "c")==2, "C", 
                                    ifelse((sum(as.character(as.vector(df_letters))  == "b")==2) & (sum(as.character(as.vector(df_letters))  == "a")==1) ,"B", NA  )))

My desired output is:

> df_letters
  basket1 basket2 basket3 match
1       a       b       b     B
2       c       b       c     C
3       a       c       a  <NA>
4       c       b       c     C
5       b       b       c  <NA>

Many thanks in advance!

Tags: r, if-statement, dplyr

Solution


One dplyr option could be:

library(dplyr)

# rowSums() counts the matching letters per row across the basket columns
df_letters %>%
 mutate(match = case_when(rowSums(select(., starts_with("basket")) == "b") == 2 & rowSums(select(., starts_with("basket")) == "a") == 1 ~ "B",
                          rowSums(select(., starts_with("basket")) == "c") == 2 ~ "C",
                          TRUE ~ NA_character_))  # every other row stays NA

  basket1 basket2 basket3 match
1       a       b       b     B
2       c       b       c     C
3       a       c       a  <NA>
4       c       b       c     C
5       b       b       c  <NA>
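
As an alternative that is not part of the answer above, here is a minimal base R sketch of the same idea. It assumes the data shown in the desired-output table (typed in by hand so the result does not depend on the random seed), and the names basket_cols, total_c, n_a, n_b and n_c are made up for illustration. It also shows why a plain sum() cannot work here: it collapses the whole selection to a single number, whereas rowSums() returns one count per row.

```r
# Data copied from the desired-output table (an assumption made for
# reproducibility; the original question generates it with sample()).
df_letters <- data.frame(
  basket1 = c("a", "c", "a", "c", "b"),
  basket2 = c("b", "b", "c", "b", "b"),
  basket3 = c("b", "c", "a", "c", "c"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: the names of the columns to compare.
basket_cols <- names(df_letters)[startsWith(names(df_letters), "basket")]

# sum() returns ONE number for the whole data frame, so it cannot
# distinguish between rows.
total_c <- sum(df_letters[basket_cols] == "c")   # 6

# rowSums() returns one count per row, which is what the conditions need.
n_a <- rowSums(df_letters[basket_cols] == "a")
n_b <- rowSums(df_letters[basket_cols] == "b")
n_c <- rowSums(df_letters[basket_cols] == "c")   # 0 2 1 2 1

# Start with NA everywhere, then fill in the rows that meet a condition.
match <- rep(NA_character_, nrow(df_letters))
match[n_c == 2] <- "C"
match[n_b == 2 & n_a == 1] <- "B"
df_letters$match <- match

print(df_letters)
```

Computing each count once also avoids repeating the column selection in every condition, which may be easier to read when more letters or rules are added.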
