Nesting ifelse in R to mutate across multiple columns

Problem description

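To keep things simple: I am working in R with data from an experiment that has 2 questions and 3 conditions. Each participant was assigned to one of the 3 conditions and answered both questions.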

The data are currently organized so that each column holds the answers to one question under one condition:

Respondent.ID <- c("ID1", "ID2", "ID3", "ID4", "ID5", "ID6", "ID7", "ID8", "ID9")
Q1.CondA <- c("Correct", "Incorrect", "I don't know", "", "", "", "", "", "")
Q1.CondB <- c("", "", "", "Incorrect", "Correct", "I don't know", "", "", "")
Q1.CondC <- c("", "", "", "", "", "", "I don't know", "Correct", "Incorrect")
Q2.CondA <- c("Incorrect", "Correct", "I don't know", "", "", "", "", "", "")
Q2.CondB <- c("", "", "", "I don't know", "Correct", "Incorrect", "", "", "")
Q2.CondC <- c("", "", "", "", "", "", "Correct", "Incorrect", "I don't know")

current <- data.frame(Respondent.ID, Q1.CondA, Q1.CondB, Q1.CondC, Q2.CondA, Q2.CondB, Q2.CondC)

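I would like to reorganize the data so that one column shows the assigned condition, one column shows the answers to question 1 across all conditions, and one column shows the answers to question 2 across all conditions: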

Respondent.ID <- c("ID1", "ID2", "ID3", "ID4", "ID5", "ID6", "ID7", "ID8", "ID9")
Condition <- c("A", "A", "A", "B", "B", "B", "C", "C", "C")
Q1 <- c("Correct", "Incorrect", "I don't know", "Incorrect", "Correct", "I don't know", "I don't know", 
"Correct", "Incorrect")
Q2 <- c("Incorrect", "Correct", "I don't know", "I don't know", "Correct", "Incorrect", "Correct", "Incorrect", "I don't know")
desired <- data.frame(Respondent.ID, Condition, Q1, Q2)

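I can do this in Excel with nested IF statements (if the cell is non-blank, copy the cell, ELSE IF the next cell is non-blank, copy the next cell, ELSE IF ...). But I would rather do everything in R if possible.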

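Excel's drag-to-fill feature means I can type this cell-level nested IF statement once and it adjusts automatically for the other cells. My actual data set has 40 questions, 8 conditions, and 800 participants, so doing this by hand is not feasible.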

In R, I tried combining mutate() with nested ifelse() statements:

dplyr::mutate(current, Condition = ifelse(current$Q1.CondA != ' ', 'A', 
       ifelse(current$Q1.CondB !=' ', 'B',
              ifelse(current$Q1.CondC !=' ', 'C', ' '))))

But it just looks at the first element of the column, finds a non-blank, and returns a new Condition column filled with "A".

Is there a way to get this working in R, to go from my current data frame to my desired data frame?
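
A note on the attempt above, based only on the sample data shown: the blank cells hold empty strings (""), while the comparison is against a single space (' '). Since "" != " " is TRUE for every row, the first ifelse() branch always fires, which would explain a column filled with "A"; ifelse() itself is vectorized and does not stop at the first element. A minimal sketch of the same nested ifelse() with the comparison changed to "" (the name with_condition and the NA fallback are illustrative choices, not part of the original):

```r
library(dplyr)

# Same sample data as in the question
current <- data.frame(
  Respondent.ID = paste0("ID", 1:9),
  Q1.CondA = c("Correct", "Incorrect", "I don't know", "", "", "", "", "", ""),
  Q1.CondB = c("", "", "", "Incorrect", "Correct", "I don't know", "", "", ""),
  Q1.CondC = c("", "", "", "", "", "", "I don't know", "Correct", "Incorrect"),
  Q2.CondA = c("Incorrect", "Correct", "I don't know", "", "", "", "", "", ""),
  Q2.CondB = c("", "", "", "I don't know", "Correct", "Incorrect", "", "", ""),
  Q2.CondC = c("", "", "", "", "", "", "Correct", "Incorrect", "I don't know")
)

# Compare against the empty string "" rather than a space " ".
# Inside mutate() the columns can be referenced directly, without current$.
with_condition <- current %>%
  mutate(Condition = ifelse(Q1.CondA != "", "A",
                     ifelse(Q1.CondB != "", "B",
                     ifelse(Q1.CondC != "", "C", NA_character_))))
```

This only derives the Condition column. Writing a similar expression for every question does not scale well to 40 questions and 8 conditions.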

Tags: r, if-statement, dplyr, data-wrangling

Solution


Another approach is to first pivot the table with tidyr, so that every column except the ID is gathered into a pair of name and value columns.

library(tidyr)
current %>% 
  tidyr::pivot_longer(cols = -Respondent.ID)

The resulting name column can then be split into 2 columns, using ".Cond" as the separator:

current %>% 
  tidyr::pivot_longer(cols = -Respondent.ID) %>% 
  tidyr::separate(name, c("question_id", "condition"), sep = ".Cond")
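
The pivot and the split can also be done in a single step, sketched here under the assumption that every column name follows the Question.CondX pattern from the sample data. pivot_longer() accepts names_to together with names_sep; the dot is escaped because the separator is treated as a regular expression, in which an unescaped "." matches any character. The object name long is illustrative.

```r
library(tidyr)

# Same sample data as in the question
current <- data.frame(
  Respondent.ID = paste0("ID", 1:9),
  Q1.CondA = c("Correct", "Incorrect", "I don't know", "", "", "", "", "", ""),
  Q1.CondB = c("", "", "", "Incorrect", "Correct", "I don't know", "", "", ""),
  Q1.CondC = c("", "", "", "", "", "", "I don't know", "Correct", "Incorrect"),
  Q2.CondA = c("Incorrect", "Correct", "I don't know", "", "", "", "", "", ""),
  Q2.CondB = c("", "", "", "I don't know", "Correct", "Incorrect", "", "", ""),
  Q2.CondC = c("", "", "", "", "", "", "Correct", "Incorrect", "I don't know")
)

# Split each column name into question and condition while pivoting
long <- current %>%
  pivot_longer(
    cols = -Respondent.ID,
    names_to = c("question_id", "condition"),
    names_sep = "\\.Cond"  # escaped dot: match a literal "."
  )
```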

Dropping the empty strings and pivoting wider then yields the desired output:

current %>% 
  tidyr::pivot_longer(cols = -Respondent.ID) %>% 
  tidyr::separate(name, c("question_id", "condition"), sep = ".Cond") %>% 
  dplyr::filter(value != "") %>% 
  tidyr::pivot_wider(names_from = question_id)
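
As a possible shortcut, assuming the same naming pattern and that all question columns start with "Q", the special ".value" entry in names_to lets pivot_longer() produce the Q1 and Q2 columns directly, with no pivot_wider() step. The names result and Condition are chosen here to match the desired data frame and are not from the original answer.

```r
library(tidyr)
library(dplyr)

# Same sample data as in the question
current <- data.frame(
  Respondent.ID = paste0("ID", 1:9),
  Q1.CondA = c("Correct", "Incorrect", "I don't know", "", "", "", "", "", ""),
  Q1.CondB = c("", "", "", "Incorrect", "Correct", "I don't know", "", "", ""),
  Q1.CondC = c("", "", "", "", "", "", "I don't know", "Correct", "Incorrect"),
  Q2.CondA = c("Incorrect", "Correct", "I don't know", "", "", "", "", "", ""),
  Q2.CondB = c("", "", "", "I don't know", "Correct", "Incorrect", "", "", ""),
  Q2.CondC = c("", "", "", "", "", "", "Correct", "Incorrect", "I don't know")
)

result <- current %>%
  pivot_longer(
    cols = -Respondent.ID,
    # ".value" turns the part before ".Cond" (Q1, Q2) into output columns;
    # the part after it becomes the Condition column
    names_to = c(".value", "Condition"),
    names_sep = "\\.Cond"
  ) %>%
  # Keep only rows where at least one question was answered, i.e. the
  # condition the participant was actually assigned to
  filter(if_any(starts_with("Q"), ~ .x != ""))
```

Filtering on any non-empty answer, rather than on Q1 alone, keeps a participant who skipped a single question.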
