Making long data wide and collapsing rows

Question

I am trying to convert a data frame from long format to wide format. Currently df looks like this:

dput(head(df,10))
structure(list(TECH_ID = c("14050154", "14050154", "13835650", 
"13835650", "13469601", "13469601", "13782883", "13782883", "12342837", 
"12342837"), MNSCU_QUES = c("What language did you learn to speak first?", 
"Which language do you speak most often at home?", "What language did you learn to speak first?", 
"Which language do you speak most often at home?", "What language did you learn to speak first?", 
"Which language do you speak most often at home?", "What language did you learn to speak first?", 
"Which language do you speak most often at home?", "What language did you learn to speak first?", 
"Which language do you speak most often at home?"), MNSCU_RESP = c("English and another language", 
"Another", "English only", "English", "English only", "English", 
"English and another language", "English", "English only", "English"
)), row.names = c(NA, 10L), class = "data.frame")

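I want to reshape the data frame so that it has one row per TECH_ID and one column per question in MNSCU_QUES, with the MNSCU_RESP values in the cells. (The original post showed the target layout as an image.)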

I have been trying to do this with the following code:

df_wide <- dcast(df, TECH_ID+MNSCU_RESP~MNSCU_QUES)

But the resulting data frame looks like this:

dput(head(df_wide,10))
structure(list(TECH_ID = c("00007179", "00007179", "00008201", 
"00008201", "00020900", "00020900", "00021757", "00021757", "00031227", 
"00031227"), MNSCU_RESP = c("English", "English only", "English", 
"English only", "English", "English only", "English", "English only", 
"English", "English only"), `What language did you learn to speak first?` = c(0L, 
1L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 1L), `Which language do you speak most often at home?` = c(1L, 
0L, 1L, 0L, 1L, 0L, 1L, 0L, 1L, 0L)), row.names = c(NA, 10L), class = "data.frame")

Tags: r, dataframe, dplyr, tidyr, plyr

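Solution

Put only TECH_ID on the left-hand side of the formula and pass MNSCU_RESP as value.var, so that the responses fill the cells instead of defining extra rows:
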
library(reshape2)

df <- structure(list(
  TECH_ID = c("14050154", "14050154", "13835650", "13835650", "13469601",
              "13469601", "13782883", "13782883", "12342837", "12342837"),
  MNSCU_QUES = c("What language did you learn to speak first?",
                 "Which language do you speak most often at home?",
                 "What language did you learn to speak first?",
                 "Which language do you speak most often at home?",
                 "What language did you learn to speak first?",
                 "Which language do you speak most often at home?",
                 "What language did you learn to speak first?",
                 "Which language do you speak most often at home?",
                 "What language did you learn to speak first?",
                 "Which language do you speak most often at home?"),
  MNSCU_RESP = c("English and another language", "Another",
                 "English only", "English",
                 "English only", "English",
                 "English and another language", "English",
                 "English only", "English")
), row.names = c(NA, 10L), class = "data.frame")

df_wide <- reshape2::dcast(df, TECH_ID~MNSCU_QUES, value.var = "MNSCU_RESP")

> df_wide
   TECH_ID What language did you learn to speak first? Which language do you speak most often at home?
1 12342837                                English only                                         English
2 13469601                                English only                                         English
3 13782883                English and another language                                         English
4 13835650                                English only                                         English
5 14050154                English and another language                                         Another
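
Since the question is also tagged tidyr, the same reshape can probably be written with tidyr::pivot_wider. The following is a minimal sketch, assuming tidyr 1.0.0 or later is installed; the data frame is an abbreviated copy of the question's sample (three respondents), and the name df_wide_tidyr is only illustrative.

```r
library(tidyr)

# Abbreviated copy of the question's sample data (three respondents)
df <- data.frame(
  TECH_ID = c("14050154", "14050154", "13835650",
              "13835650", "13469601", "13469601"),
  MNSCU_QUES = rep(c("What language did you learn to speak first?",
                     "Which language do you speak most often at home?"),
                   times = 3),
  MNSCU_RESP = c("English and another language", "Another",
                 "English only", "English",
                 "English only", "English"),
  stringsAsFactors = FALSE
)

# One row per TECH_ID, one column per question, responses in the cells
df_wide_tidyr <- pivot_wider(
  df,
  id_cols = TECH_ID,
  names_from = MNSCU_QUES,
  values_from = MNSCU_RESP
)

print(df_wide_tidyr)
```

As in the dcast call, TECH_ID is the only identifier; MNSCU_RESP supplies the cell values rather than acting as a grouping column.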

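If the full data contains more than one row for the same TECH_ID and question, reshape2::dcast needs an aggregation function and, when none is supplied, falls back to length, so the cells hold counts rather than responses. That may be where 0/1 values like those in the question's output come from. The following base-R sketch shows one way to check for such repeats before casting; the repeated row in the sample is hypothetical and added only for illustration, and the names key_cols, dup_rows and df_unique are likewise illustrative.

```r
# Abbreviated sample in the question's layout, plus one deliberately
# repeated row (hypothetical) to illustrate a duplicate
df <- data.frame(
  TECH_ID = c("14050154", "14050154", "13835650", "13835650", "13835650"),
  MNSCU_QUES = c("What language did you learn to speak first?",
                 "Which language do you speak most often at home?",
                 "What language did you learn to speak first?",
                 "Which language do you speak most often at home?",
                 "Which language do you speak most often at home?"),
  MNSCU_RESP = c("English and another language", "Another",
                 "English only", "English", "English"),
  stringsAsFactors = FALSE
)

key_cols <- c("TECH_ID", "MNSCU_QUES")

# TRUE for every row whose TECH_ID/question pair occurs more than once
is_dup <- duplicated(df[key_cols]) | duplicated(df[key_cols], fromLast = TRUE)
dup_rows <- df[is_dup, ]
print(dup_rows)

# Drop exact repeats of whole rows before casting
df_unique <- unique(df)

# 0 means every TECH_ID/question pair is now unique
print(anyDuplicated(df_unique[key_cols]))
```

If repeats remain after unique() because the same respondent gave different answers to the same question, the data need a decision about which answer to keep before they can be cast to one cell per pair.
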