Create a binary column based on one condition across multiple columns

Problem description

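I have exported data from Survey Monkey. For each question, the export creates a separate column per answer option. If the respondent selected that option, the column holds a character value; otherwise it is NA (see the data frame below).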

I want to create a new binary column based on the same condition applied across multiple columns.

diag <- structure(list(
  diag_stress_fracture = c(NA, "Stress fracture(s)", NA, NA, NA, NA),
  diag_disordered_eating = c(NA_character_, NA_character_,
    NA_character_, NA_character_, NA_character_, NA_character_),
  diag_asthma = c(NA, "Asthma", NA, NA, NA, NA),
  diag_low_bone_density = c(NA_character_, NA_character_,
    NA_character_, NA_character_, NA_character_, NA_character_),
  diag_acl_rupture = c(NA_character_, NA_character_,
    NA_character_, NA_character_, NA_character_, NA_character_),
  diag_concussion = c(NA, "Concussion", NA, NA, NA, NA),
  diag_depression_or_anxiety = c(NA_character_, NA_character_,
    NA_character_, NA_character_, NA_character_, NA_character_),
  diag_haemochromatosis = c(NA_character_, NA_character_,
    NA_character_, NA_character_, NA_character_, NA_character_),
  diag_hypothyroidism = c(NA_character_, NA_character_,
    NA_character_, NA_character_, NA_character_, NA_character_),
  diag_oligomenorrhea_or_amenorrhoea = c(NA_character_, NA_character_,
    NA_character_, NA_character_, NA_character_, NA_character_)),
  .Names = c("diag_stress_fracture", "diag_disordered_eating",
    "diag_asthma", "diag_low_bone_density", "diag_acl_rupture",
    "diag_concussion", "diag_depression_or_anxiety",
    "diag_haemochromatosis", "diag_hypothyroidism",
    "diag_oligomenorrhea_or_amenorrhoea"),
  row.names = c(NA, 6L), class = "data.frame")

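Essentially, I want to know whether a participant has any diagnosis, regardless of which one. I can get the result I want with the following code (where ... stands for the remaining columns of interest listed above; I have truncated the example):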

library(dplyr)

# `...` stands for the remaining diag_* columns, elided in the original post
diag <- diag %>%
  mutate(diag.yn = ifelse(!is.na(diag_stress_fracture) |
                            !is.na(diag_disordered_eating) |
                            !is.na(diag_asthma) | ... , 1, 0))

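However, since I want to do this for several questions, this is clearly clumsy and time-consuming. Is there a way to do it using column positions? In my full data set, for example, these columns are 38:47.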

Tags: r, if-statement, dplyr

Solution
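One possible approach, given as a minimal sketch rather than a definitive answer: count the non-missing cells per row over a block of columns selected by position, and flag rows where that count is above zero. The helper name `any_non_na` is hypothetical (it is not from the original post), and the three-column data frame is a shortened stand-in for the survey export. The result is an integer 0/1 column rather than the numeric 1/0 produced by `ifelse()`.

```r
# Hypothetical helper (not from the original post): returns 1L for rows with
# at least one non-NA value among the selected columns, otherwise 0L.
any_non_na <- function(df, cols) {
  # is.na() on a data frame gives a logical matrix; rowSums() of its negation
  # counts the non-missing cells in each row.
  as.integer(rowSums(!is.na(df[, cols, drop = FALSE])) > 0)
}

# Shortened stand-in for the survey export (three of the diag_* columns).
diag <- data.frame(
  diag_stress_fracture   = c(NA, "Stress fracture(s)", NA, NA, NA, NA),
  diag_disordered_eating = rep(NA_character_, 6),
  diag_asthma            = c(NA, "Asthma", NA, NA, NA, NA),
  stringsAsFactors = FALSE
)

# Select by position; in the full data set this would be 38:47.
diag$diag.yn <- any_non_na(diag, 1:3)
```

Because `cols` is passed straight to `[`, it can also be a character vector of names, for example `grep("^diag_", names(diag), value = TRUE)`, which may be more robust than positions if the column order changes. Within a dplyr pipeline, recent versions (1.0.4 or later) offer `if_any()`, so something like `mutate(diag.yn = as.integer(if_any(38:47, ~ !is.na(.x))))` should give the same flag.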

