Recode multiple columns in an R data frame

Problem description

I am trying to convert "Y" to 1 across many columns, and the set of columns may change (for example, it could go up to x20).

Below is a sample of the data along with the expected output.

# dplyr provides %>%, mutate_at() and case_when(), and re-exports tibble()
library(dplyr)

Data <- tibble(Date = seq.Date(as.Date('2019-01-01'),as.Date('2019-01-08'), by = "day"), 
               x1 = c("Y","","","Y","Y","","Y","Y"),
               x2 = c("","Y","Y","Y","Y","","Y","Y"))


Data_output <- tibble(Date = seq.Date(as.Date('2019-01-01'),as.Date('2019-01-08'), by = "day"), 
               x1 = c(1,0,0,1,1,0,1,1),
               x2 = c(0,1,1,1,1,0,1,1))


Solution


dplyr

Data %>% 
  mutate_at(vars(contains("x")),~case_when(.=="Y" ~1,
                                           .=="" ~0))

Or, as @akrun suggested:

Data %>% 
  mutate_at(vars(contains("x")), ~as.integer(.=="Y"))  

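Result (from the case_when() version; the as.integer() version gives the same values in <int> columns):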

# A tibble: 8 x 3
  Date          x1    x2
  <date>     <dbl> <dbl>
1 2019-01-01     1     0
2 2019-01-02     0     1
3 2019-01-03     0     1
4 2019-01-04     1     1
5 2019-01-05     1     1
6 2019-01-06     0     0
7 2019-01-07     1     1
8 2019-01-08     1     1
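
In dplyr 1.0.0 and later, mutate_at() is superseded by across(), so the same recoding can be written as below. This is only a sketch, not part of the original answer: the selector matches("^x[0-9]+$") assumes the columns are always named x1, x2, ..., x20, and Data_recoded is a name chosen here for illustration.

```r
library(dplyr)

Data <- tibble(
  Date = seq.Date(as.Date("2019-01-01"), as.Date("2019-01-08"), by = "day"),
  x1 = c("Y", "", "", "Y", "Y", "", "Y", "Y"),
  x2 = c("", "Y", "Y", "Y", "Y", "", "Y", "Y")
)

# across() applies the same function to every selected column.
# matches("^x[0-9]+$") is an assumed naming rule (x followed by digits);
# adjust the selector if your columns are named differently.
Data_recoded <- Data %>%
  mutate(across(matches("^x[0-9]+$"), ~ as.integer(.x == "Y")))
```

Because the selector works on column names rather than a fixed list, new columns such as x3 or x20 are picked up without changing the code. Note that the comparison maps every non-NA value other than "Y" to 0, whereas the case_when() version returns NA for anything that is neither "Y" nor "".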
