How to pivot multiple data frame columns as a group in R

Question

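I have read many posts about pivoting, stacking, and reshaping data frames, but none of them led me to a solution for my problem. Consider the following df: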

QID <- c('Q1', 'Q2', 'Q3')
Qtype <- c('Matrix', 'TE', 'DB')
Qtext <- c('A', 'B', 'C')
subQuestion.1.recode <- c('1', NA, NA)
subQuestion.1.description <- c('foo', NA, NA)
subQuestion.1.choice <- c('baa', NA, NA)
subQuestion.2.recode <- c('2', NA, NA)
subQuestion.2.description <- c('foo2', NA, NA)
subQuestion.2.choice <- c('baa2', NA, NA)
subQuestion.3.recode <- c('3', NA, NA)
subQuestion.3.description <- c('foo3', NA, NA)
subQuestion.3.choice <- c('baa3', NA, NA)

df <- data.frame(QID, Qtype, Qtext, subQuestion.1.recode, subQuestion.1.description, subQuestion.1.choice, subQuestion.2.recode, 
                 subQuestion.2.description, subQuestion.2.choice, subQuestion.3.recode, subQuestion.3.description, subQuestion.3.choice)

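I want to pivot the "subQuestion" columns from wide to long. I also want to keep the columns that belong to the same sub-question together in one row, grouped by the number in the column name (which is the same as the number in the "subQuestion.recode" column). The output should look like this: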

QID <- c('Q1', 'Q1', 'Q1', 'Q2', 'Q3')
Qtype <- c('Matrix', 'Matrix', 'Matrix', 'TE', 'DB')
Qtext <- c('A', 'A', 'A', 'B', 'C')
subQuestion.recode <- c('1', '2', '3', NA, NA)
subQuestion.description <- c('foo', 'foo2', 'foo3', NA, NA)
subQuestion.choice <- c('baa', 'baa2', 'baa3', NA, NA)

df_out <- data.frame(QID, Qtype, Qtext, subQuestion.recode, subQuestion.description, subQuestion.choice)

Thanks in advance for your help!

Tags: r, dataframe

Solution


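Try this. The key is to work on the name column that pivot_longer() produces. An id is also needed to keep track of the rows, and it can be taken from the number in each variable name. After that, you only have to clean up the values and reshape back to wide, keeping the id variables you mention in the question. Here is the code using a tidyverse approach: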

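library(tidyverse)

# Reshape everything except the two key columns to long form
df %>% pivot_longer(cols = -c(QID, Qtype)) %>%
  # Split e.g. 'subQuestion.1.recode' into V1 = prefix, V2 = id, V3 = field;
  # 'Qtext' has no dots, so its V2 and V3 become NA (separate() warns about this)
  separate(name, c('V1', 'V2', 'V3'), sep = '\\.') %>%
  group_by(QID, Qtype) %>%
  # Give the Qtext row the id of the row that follows it
  fill(V2, .direction = 'up') %>%
  # Drop rows whose value is missing
  filter(!is.na(value)) %>%
  # Rebuild the target column names without the id
  mutate(Var = ifelse(is.na(V3), V1, paste0(V1, '.', V3))) %>%
  ungroup() %>%
  select(-V1, -V3) %>%
  # One row per QID / Qtype / id
  pivot_wider(names_from = Var, values_from = value) %>%
  # Remove the id and copy Qtext down to the other rows of the same question
  select(-V2) %>% fill(Qtext)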

Output:

# A tibble: 5 x 6
  QID   Qtype  Qtext subQuestion.recode subQuestion.description subQuestion.choice
  <chr> <chr>  <chr> <chr>              <chr>                   <chr>             
1 Q1    Matrix A     1                  foo                     baa               
2 Q1    Matrix A     2                  foo2                    baa2              
3 Q1    Matrix A     3                  foo3                    baa3              
4 Q2    TE     B     NA                 NA                      NA                
5 Q3    DB     C     NA                 NA                      NA    

Data used:

#Data
df <- structure(list(QID = c("Q1", "Q2", "Q3"), Qtype = c("Matrix", 
"TE", "DB"), Qtext = c("A", "B", "C"), subQuestion.1.recode = c("1", 
NA, NA), subQuestion.1.description = c("foo", NA, NA), subQuestion.1.choice = c("baa", 
NA, NA), subQuestion.2.recode = c("2", NA, NA), subQuestion.2.description = c("foo2", 
NA, NA), subQuestion.2.choice = c("baa2", NA, NA), subQuestion.3.recode = c("3", 
NA, NA), subQuestion.3.description = c("foo3", NA, NA), subQuestion.3.choice = c("baa3", 
NA, NA)), class = "data.frame", row.names = c(NA, -3L))
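
For comparison, here is a minimal alternative sketch that is not part of the answer above. It uses the ".value" sentinel of pivot_longer() to keep the recode, description, and choice fields of each sub-question together in one step. It assumes every sub-question column is named subQuestion.<number>.<field> and that a missing recode marks an unused slot; sub_id and df_long are helper names introduced here purely for illustration.

```r
library(dplyr)
library(tidyr)

# Same example data as in the question.
df <- data.frame(
  QID = c("Q1", "Q2", "Q3"),
  Qtype = c("Matrix", "TE", "DB"),
  Qtext = c("A", "B", "C"),
  subQuestion.1.recode = c("1", NA, NA),
  subQuestion.1.description = c("foo", NA, NA),
  subQuestion.1.choice = c("baa", NA, NA),
  subQuestion.2.recode = c("2", NA, NA),
  subQuestion.2.description = c("foo2", NA, NA),
  subQuestion.2.choice = c("baa2", NA, NA),
  subQuestion.3.recode = c("3", NA, NA),
  subQuestion.3.description = c("foo3", NA, NA),
  subQuestion.3.choice = c("baa3", NA, NA),
  stringsAsFactors = FALSE
)

df_long <- df %>%
  # Move the number to the end of the name:
  # subQuestion.1.recode -> subQuestion.recode_1
  rename_with(
    ~ sub("^subQuestion\\.([0-9]+)\\.(.+)$", "subQuestion.\\2_\\1", .x),
    starts_with("subQuestion")
  ) %>%
  # ".value" turns the part before "_" into output columns;
  # the part after "_" goes into the helper column sub_id (hypothetical name)
  pivot_longer(
    cols = starts_with("subQuestion"),
    names_to = c(".value", "sub_id"),
    names_sep = "_"
  ) %>%
  group_by(QID) %>%
  # Keep every filled sub-question; for questions without any,
  # keep a single all-NA row so the question itself is not lost
  filter(
    !is.na(subQuestion.recode) |
      (all(is.na(subQuestion.recode)) & row_number() == 1)
  ) %>%
  ungroup() %>%
  select(-sub_id)

print(df_long)
```

On this example data the result has the same rows and columns as the desired df_out. Renaming first is a design choice: it puts the number in a position where a single separator can split it off, so no fill() or pivot_wider() step is needed afterwards.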
