Reshaping a data frame from long to wide format

Question

I have a data frame in the following long format:

library(data.table)

studentInfo <- data.frame(
  University = c("A", "B", "C", "B", "A", "D"),
  StudentID  = c("S1", "S1", "S2", "S2", "S3", "S3"),
  Subject    = c("Maths", "Science", "English", "Maths", "History", "English")
)

# Convert to a data.table so that data.table's dcast() method is used
studentInfo <- as.data.table(studentInfo)



    University   StudentID     Subject
1   A            S1            Maths
2   B            S1            Science
3   C            S2            English
4   B            S2            Maths
5   A            S3            History
6   D            S3            English

dcast(studentInfo, StudentID ~ Subject, value.var = "Subject")

I get the following output:

 StudentID English History Maths Science
1:        S1    <NA>    <NA> Maths Science
2:        S2 English    <NA> Maths    <NA>
3:        S3 English History  <NA>    <NA>


I would like to get the following instead:

    University  StudentID  S1     S3       S1       S2     S2       S3
1   A           S1         Maths
5   A           S3                History
2   B           S1                         Science
4   B           S2                                  Maths
3   C           S2                                         English
6   D           S3                                                  English

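I am new to R. I am preparing a dataset to run a Heatmap/Oncoprint. I tried dcast from reshape2 and the spread function, but could not get the format required for the next step of my workflow.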

Thanks

Tags: r, transpose

Answer


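You can add a column holding the row number, so that every input row stays on its own output row, and then pivot the data to wide format. Appending a per-student counter to StudentID keeps the new column names unique.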

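library(dplyr)

studentInfo %>%
    # row id: keeps one output row per input row
    mutate(row = row_number()) %>%
    group_by(StudentID) %>%
    # within each student, number the records: S1_1, S1_2, ...
    mutate(StudentID = paste(StudentID, row_number(), sep = "_")) %>%
    ungroup() %>%
    tidyr::pivot_wider(names_from = StudentID, values_from = Subject) %>%
    select(-row)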

# A tibble: 6 x 7
#  University S1_1  S1_2    S2_1    S2_2  S3_1    S3_2   
#  <chr>      <chr> <chr>   <chr>   <chr> <chr>   <chr>  
#1 A          Maths NA      NA      NA    NA      NA     
#2 B          NA    Science NA      NA    NA      NA     
#3 C          NA    NA      English NA    NA      NA     
#4 B          NA    NA      NA      Maths NA      NA     
#5 A          NA    NA      NA      NA    History NA     
#6 D          NA    NA      NA      NA    NA      English
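
Since the question already uses data.table, the same idea could also be sketched with data.table's dcast(). This is only an illustration, not part of the original answer: the helper columns row and col and the result name wide are hypothetical names chosen here, and rowid() is used to number the records within each StudentID.

```r
library(data.table)

# Same example data as in the question
studentInfo <- data.table(
  University = c("A", "B", "C", "B", "A", "D"),
  StudentID  = c("S1", "S1", "S2", "S2", "S3", "S3"),
  Subject    = c("Maths", "Science", "English", "Maths", "History", "English")
)

# row: keeps one output row per input row (hypothetical helper column)
studentInfo[, row := .I]
# col: unique column label such as S1_1, S1_2 (hypothetical helper column)
studentInfo[, col := paste(StudentID, rowid(StudentID), sep = "_")]

# Cast to wide format, then drop the helper row id
wide <- dcast(studentInfo, row + University ~ col, value.var = "Subject")
wide[, row := NULL]
print(wide)
```

Cells without a matching record are filled with NA, as in the dplyr version.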

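Having a data frame with duplicate column names is not recommended, which is why the column names above carry a numeric suffix instead of repeating S1, S2 and S3.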
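
A small base R sketch of why duplicated names are awkward, using a made-up two-column data frame (dup is a hypothetical name, not from the question): selecting by name only reaches the first matching column, and make.unique() is one way to repair such names.

```r
# check.names = FALSE lets data.frame() keep the duplicated name
dup <- data.frame(S1 = "Maths", S1 = "Science", check.names = FALSE)

names(dup)   # both columns are called "S1"
dup$S1       # by name, only the first "S1" column is reachable

# make.unique() appends a suffix to repeated names
names(dup) <- make.unique(names(dup), sep = "_")
names(dup)   # "S1" "S1_1"
```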

