首页 > 解决方案 > R pivot_wider 所以重复行成为标题

问题描述

我正在尝试转换长数据,以便重复行值成为标题。数据如下所示:

# A tibble: 12 x 2
   x1       x2           
   <chr>    <chr>        
 1 Position 1            
 2 Name     Jon Ellis    
 3 Sex      m            
 4 Year     2017         
 5 Category Open         
 6 Time     06:37:27     
 7 Position 2            
 8 Name     Craig Holgate
 9 Sex      m            
10 Year     2015         
11 Category Open         
12 Time     06:43:45 

我希望我的重复行值(“职位”、“姓名”、“性别”、“年份”、“类别”、“时间”)成为标题,但尽管进行了多次尝试,但尚未弄清楚如何传播/枢轴实现这一目标的数据。感谢指点,谢谢。

structure(list(x1 = c("Position", "Name", "Sex", "Year", "Category", 
"Time", "Position", "Name", "Sex", "Year", "Category", "Time", 
"Position", "Name", "Sex", "Year", "Category", "Time", "Position", 
"Name", "Sex", "Year", "Category", "Time"), x2 = c("1", "Jon Ellis", 
"m", "2017", "Open", "06:37:27", "2", "Craig Holgate", "m", "2015", 
"Open", "06:43:45", "3", "Stuart Leaney", "m", "2018", "Open", 
"06:46:03", "4", "Craig Holgate", "m", "2013", "Open", "06:47:19"
)), row.names = c(NA, -24L), class = c("tbl_df", "tbl", "data.frame"
))

标签: rdplyr

解决方案


1) dplyr/tidyr Add a grouping column, row, convert from long to wide form, remove row and convert the column types.

library(dplyr)
library(tidyr)

DF %>%
  mutate(row = cumsum(x1 == "Position")) %>%
  pivot_wider(names_from = x1, values_from = x2) %>%
  select(-row) %>%
  type.convert(as.is = TRUE) 

giving:

# A tibble: 2 x 6
  Position Name          Sex    Year Category Time    
     <int> <chr>         <chr> <int> <chr>    <chr>   
1        1 Jon Ellis     m      2017 Open     06:37:27
2        2 Craig Holgate m      2015 Open     06:43:45

2) Base R Convert to Debian Control File format using string manipulation and read that with read.dcf creating a character matrix, convert to data frame and fix up the types.

txt <- with(DF, sub("Position", "\nPosition", sprintf("%s: %s", x1, x2)))
type.convert(as.data.frame(read.dcf(textConnection(txt))), as.is = TRUE)

giving:

  Position          Name Sex Year Category     Time
1        1     Jon Ellis   m 2017     Open 06:37:27
2        2 Craig Holgate   m 2015     Open 06:43:45

or expressed in terms of the Bizarro pipe which only needs base R:

DF ->.;
  with(., sub("Position", "\nPosition", sprintf("%s: %s", x1, x2))) ->.;
  textConnection(.) ->.;
  read.dcf(.) ->.;
  as.data.frame(.) ->.;
  type.convert(., as.is = TRUE)

Note

DF <- structure(list(x1 = c("Position", "Name", "Sex", "Year", "Category", 
"Time", "Position", "Name", "Sex", "Year", "Category", "Time"
), x2 = c("1", "Jon Ellis", "m", "2017", "Open", "06:37:27", 
"2", "Craig Holgate", "m", "2015", "Open", "06:43:45")), class = "data.frame", row.names = c(NA, 
-12L))

推荐阅读