How to create multiple columns from one column, possibly using dcast or tidyverse

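Problem description

I am learning R and trying to figure out how to split a column. I would like to spread the data from a single column into wide format. I was told to use dcast, but I have not found the best way to do it yet, and I was going to try piping it through tidyverse.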

# sample data
data <- data.frame(trimesterPeriod = c("first", "second", "third", "PP", "third", "second", "PP", "first"))
# dataframe 
  trimesterPeriod 
1 first
2 second
3 third
4 PP
5 third
6 second
7 PP
8 first

and I would like it to look like this:

#dataframe
ID     first       second       third       PP
1        1            0           0         0
2        0            1           0         0 
3        0            0           1         0
4        0            0           0         1 
5        0            0           1         0 
6        0            1           0         0 
7        0            0           0         1
8        1            0           0         0 

I know I will have to change the trimesterPeriod data from a character, but from there I am not sure where to go. I was thinking of doing this:

data.frame %>%
    mutate(rn = row_number(first, second, third, PP)) %>%
    spread(trimesterPeriod) %>%
    select(-rn)

But I am not sure. Any advice is greatly appreciated!

Tags: r, tidyverse, multiple-columns, dcast

Solution


We can use table from base R

table(seq_len(nrow(data)), data$trimesterPeriod)

-output

    first PP second third
  1     1  0      0     0
  2     0  0      1     0
  3     0  0      0     1
  4     0  1      0     0
  5     0  0      0     1
  6     0  0      1     0
  7     0  1      0     0
  8     1  0      0     0
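Note that table orders the columns alphabetically (first, PP, second, third) rather than in the order shown in the question. If the column order matters, one option is to convert trimesterPeriod to a factor with explicit levels first. The following is a minimal sketch and not part of the original answer; the names periods, tab and wide are chosen here purely for illustration, and the level order is assumed from the desired output in the question.

```r
# Sample data from the question
data <- data.frame(
  trimesterPeriod = c("first", "second", "third", "PP",
                      "third", "second", "PP", "first")
)

# Assumption: the desired column order is first, second, third, PP
periods <- factor(data$trimesterPeriod,
                  levels = c("first", "second", "third", "PP"))

# Cross-tabulate the row index against the period;
# the columns follow the factor levels
tab <- table(seq_len(nrow(data)), periods)

# Convert the table to a regular data frame and add an explicit ID column
wide <- as.data.frame.matrix(tab)
wide <- cbind(ID = seq_len(nrow(wide)), wide)
```

Converting with as.data.frame.matrix keeps the wide layout; calling as.data.frame directly on a table would return the counts in long format instead.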

Or use tidyverse

library(dplyr)
library(tidyr)
 data %>% 
   mutate(ID = row_number()) %>%
   pivot_wider(names_from = trimesterPeriod, 
     values_from = trimesterPeriod, values_fn = length, 
        values_fill = 0)

-output

# A tibble: 8 × 5
     ID first second third    PP
  <int> <int>  <int> <int> <int>
1     1     1      0     0     0
2     2     0      1     0     0
3     3     0      0     1     0
4     4     0      0     0     1
5     5     0      0     1     0
6     6     0      1     0     0
7     7     0      0     0     1
8     8     1      0     0     0
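Since the question mentions dcast, here is a rough sketch of how the same reshaping might look with data.table::dcast. It is not part of the original answer. The helper column present and the names dt and wide are hypothetical, introduced only for this example, and the factor conversion is intended to keep the columns in the order shown in the question.

```r
library(data.table)

# Sample data from the question, as a data.table
dt <- data.table(
  trimesterPeriod = c("first", "second", "third", "PP",
                      "third", "second", "PP", "first")
)

# Assumption: the desired column order is first, second, third, PP
dt[, trimesterPeriod := factor(trimesterPeriod,
                               levels = c("first", "second", "third", "PP"))]

# Row index as ID, plus a hypothetical indicator column to spread
dt[, ID := .I]
dt[, present := 1L]

# One row per ID, one column per period; missing cells are filled with 0
wide <- dcast(dt, ID ~ trimesterPeriod, value.var = "present", fill = 0L)
```

Because every ID appears exactly once, no aggregation function is needed here; with repeated IDs, fun.aggregate (for example sum or length) would have to be supplied.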

data

data <- structure(list(trimesterPeriod = c("first", "second", "third", 
"PP", "third", "second", "PP", "first")),
 class = "data.frame", row.names = c("1", 
"2", "3", "4", "5", "6", "7", "8"))
