Collapse into a single row without concatenating

Problem description

My data frame is as follows:

                 df <- structure(list(RID = c(1L, 1L, 2L, 2L, 3L, 3L),
                 Sex = c("FEMALE", "FEMALE", "MALE", "MALE", "FEMALE", "FEMALE"),
                 Race = c("White", "White", "Hispanic", "Hispanic", "Black", "Black"),
                 TIME = c("Break Fast", "Break Fast", "Lunch", "Lunch", "Dinner", "Dinner"),
                 Sugar = c("Normal", "Normal", "Abnormal", "Abnormal", "Satisfactory",
                 "Satisfactory"),
                 # Mixing integers with "" coerces Test_A and Test_B to character
                 Test_A = c(90L, "", "", 157L, "", 129L),
                 Test_B = c("", 90L, 157L, "", 129L, "")),
                 class = "data.frame", row.names = c(NA, -6L))

The desired output is:

                 Requd_df <- structure(list(RID = c(1L, 2L,3L), 
                 Sex = c("FEMALE", "MALE", "FEMALE"),
                 Race = c("White", "Hispanic","Black"),
                 TIME = c("Break Fast",  "Lunch",   "Dinner"),
                 Sugar = c("Normal",  "Abnormal",  "Satisfactory"), 
                 Test_A = c(90L, 157L, 129L),
                 Test_B = c(90L , 157L, 129L)),
                 class = "data.frame", row.names = c(NA, -3L))

My code is as follows:

                 library(data.table)
                 setDT(df)

                 df1 <- df[, lapply(.SD, paste0, collapse = ""), by = RID]

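My code concatenates the elements of every column within each RID group: Sex, Race, TIME and Sugar end up pasted together as well. I need to collapse the rows without concatenating those columns. Please help.

Tags: r, data.table

Solution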


Include the other variables in by, so that only Test_A and Test_B remain in .SD and get pasted:

library(data.table)

setDT(df)
df[, lapply(.SD, paste0, collapse=""), .(RID, Sex, Race, TIME, Sugar)]

#   RID    Sex     Race       TIME        Sugar Test_A Test_B
#1:   1 FEMALE    White Break Fast       Normal     90     90
#2:   2   MALE Hispanic      Lunch     Abnormal    157    157
#3:   3 FEMALE    Black     Dinner Satisfactory    129    129
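
Note that paste0 returns character, so Test_A and Test_B in the result above are strings, whereas the desired output has integer columns. As a minimal sketch of one way to get integers, the following takes the first non-empty value per group and converts it. It assumes blanks are stored as "" (not NA) and that each group has at most one filled value per test column; the names test_cols and first_filled are hypothetical helpers introduced only for this illustration.

```r
library(data.table)

# Same data as in the question. Test_A and Test_B are written as character
# here because the original vectors mix integers with "".
df <- data.table(
  RID    = c(1L, 1L, 2L, 2L, 3L, 3L),
  Sex    = c("FEMALE", "FEMALE", "MALE", "MALE", "FEMALE", "FEMALE"),
  Race   = c("White", "White", "Hispanic", "Hispanic", "Black", "Black"),
  TIME   = c("Break Fast", "Break Fast", "Lunch", "Lunch", "Dinner", "Dinner"),
  Sugar  = c("Normal", "Normal", "Abnormal", "Abnormal",
             "Satisfactory", "Satisfactory"),
  Test_A = c("90", "", "", "157", "", "129"),
  Test_B = c("", "90", "157", "", "129", "")
)

# Columns to collapse (hypothetical helper name)
test_cols <- c("Test_A", "Test_B")

# First non-empty entry of a character vector, as integer;
# yields NA if every entry in the group is "" (assumption: blanks are "")
first_filled <- function(x) as.integer(x[x != ""][1L])

# Group by the descriptive columns; only the test columns are in .SD
res <- df[, lapply(.SD, first_filled),
          by = .(RID, Sex, Race, TIME, Sugar),
          .SDcols = test_cols]

print(res)
```

Restricting .SDcols to the test columns keeps the grouping explicit, and picking a single value rather than pasting avoids silently gluing two numbers together if a group ever holds more than one.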
