Using purrr to create several new variables based on values of existing variables

Problem description

EDIT: added sample df

I have a three-item checklist (options a, b, c) in which participants can choose as many responses as apply to them. In my data, these responses are stored as three binary variables (q4___a, q4___b, q4___c). I have the same data at four different time points (1, 2, 3, 4), so my variables are coded like this:

q4_1___a
q4_1___b
q4_1___c
q4_2___a
q4_2___b

etc., where q4 is the stem, the integer is the time at which the data was collected, and the letter is the response option. Here is a sample dataframe:

df <- data.frame(
 q4_1___a = rbinom(10, 1, .5),
 q4_1___b = rbinom(10, 1, .5),
 q4_1___c = rbinom(10, 1, .5),
 q4_2___a = rbinom(10, 1, .5),
 q4_2___b = rbinom(10, 1, .5),
 q4_2___c = rbinom(10, 1, .5),
 q4_3___a = rbinom(10, 1, .5),
 q4_3___b = rbinom(10, 1, .5),
 q4_3___c = rbinom(10, 1, .5),
 q4_4___a = rbinom(10, 1, .5),
 q4_4___b = rbinom(10, 1, .5),
 q4_4___c = rbinom(10, 1, .5)
)

I need to create "group" variables that combine the results of the three different binary response variables at each time point. I can do this at time point 1 using the following code:

library(dplyr)

df %>%
 mutate(q4_1_group = case_when(
  q4_1___a == 1 & q4_1___b == 0 & q4_1___c == 0 ~ "a",
  q4_1___a == 0 & q4_1___b == 1 & q4_1___c == 0 ~ "b",
  q4_1___a == 0 & q4_1___b == 0 & q4_1___c == 1 ~ "c",
  q4_1___a == 1 & q4_1___b == 1 & q4_1___c == 0 ~ "ab",
  q4_1___a == 1 & q4_1___b == 0 & q4_1___c == 1 ~ "ac",
  q4_1___a == 0 & q4_1___b == 1 & q4_1___c == 1 ~ "bc",
  q4_1___a == 1 & q4_1___b == 1 & q4_1___c == 1 ~ "abc"
 ))

I'm having trouble figuring out where to go from here to iterate over this across all four time points. Essentially, I need to change the 1's in all of the variable names to 2's, 3's, and 4's, so that:

df%>%
 mutate(q4_[i]_group = case_when(
  q4_[i]___a == 1 & q4_[i]___b == 0 & q4_[i]___c == 0 ~ "a",
  q4_[i]___a == 0 & q4_[i]___b == 1 & q4_[i]___c == 0 ~ "b",
  q4_[i]___a == 0 & q4_[i]___b == 0 & q4_[i]___c == 1 ~ "c",
  q4_[i]___a == 1 & q4_[i]___b == 1 & q4_[i]___c == 0 ~ "ab",
  q4_[i]___a == 1 & q4_[i]___b == 0 & q4_[i]___c == 1 ~ "ac",
  q4_[i]___a == 0 & q4_[i]___b == 1 & q4_[i]___c == 1 ~ "bc",
  q4_[i]___a == 1 & q4_[i]___b == 1 & q4_[i]___c == 1 ~ "abc"
 ))

where [i] corresponds to something like c(1:4). I feel like there must be a straightforward way to do this using purrr, but I'm struggling to figure it out. Any help would be greatly appreciated!

Tags: r, iteration, purrr

Solution


We can create a key/value lookup dataset and then join each time point's columns against it.

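library(tidyverse)

# Lookup table: one row per response pattern, with its group label
keydat <- data.frame(a = c(1, 0, 0, 1, 1, 0, 1),
                     b = c(0, 1, 0, 1, 0, 1, 1),
                     c = c(0, 0, 1, 0, 1, 1, 1),
                     group = c("a", "b", "c", "ab", "ac", "bc", "abc"),
                     stringsAsFactors = FALSE)

# Stem plus time point for each block of columns, e.g. "q4_1"
nm1 <- unique(sub("__.*", "", names(df)))

# Split the columns by time point, join each block to the lookup table
# (this relies on each block's columns being in a, b, c order), bind the
# blocks back together, then rename the resulting "group" columns
split.default(df, as.numeric(gsub("^q\\d+_|__.*$", "", names(df)))) %>%
     map(~ .x %>%
              left_join(keydat, by = setNames(letters[1:3], names(.x)))) %>%
     bind_cols %>%
     rename_at(vars(matches('group')), ~paste0(nm1, '_group'))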
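
Since the question asks about iterating with purrr specifically, here is a minimal sketch of another way to do it; it is not part of the answer above. It loops over the time points with map() and builds each label by pasting together the letters of the options that equal 1, so no lookup table or case_when() is needed. The helper make_group() and the small two-time-point df are made up for illustration, and the sketch assumes the columns contain only 0 and 1 (no missing values).

```r
library(dplyr)
library(purrr)

# Small fixed stand-in for the question's df (two time points only)
df <- data.frame(
  q4_1___a = c(1, 0, 1, 0),
  q4_1___b = c(0, 1, 1, 0),
  q4_1___c = c(0, 0, 1, 0),
  q4_2___a = c(1, 1, 0, 0),
  q4_2___b = c(0, 1, 1, 0),
  q4_2___c = c(1, 0, 1, 1)
)

opts  <- c("a", "b", "c")  # response options
times <- 1:2               # time points present in this toy df

# Hypothetical helper: for time point i, paste together the letters of
# every option whose column equals 1. Rows with no option selected get NA,
# which matches what the case_when() in the question returns for them.
make_group <- function(data, i) {
  cols  <- paste0("q4_", i, "___", opts)
  flags <- as.matrix(data[cols]) == 1
  lab   <- unname(apply(flags, 1, function(r) paste(opts[r], collapse = "")))
  ifelse(lab == "", NA_character_, lab)
}

# One character vector per time point, named q4_1_group, q4_2_group, ...
groups <- map(times, ~ make_group(df, .x)) %>%
  set_names(paste0("q4_", times, "_group"))

result <- bind_cols(df, as_tibble(groups))
result
```

With this toy data, q4_1_group comes out as "a", "b", "abc", NA and q4_2_group as "ac", "ab", "bc", "c". For the full data in the question, times would be 1:4.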
