How can I add a new variable based on three groups of variable values?

Problem description

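I have a data.frame with different job levels: Manager, Supervisor, SelfEmployed, Official, Highly professional employee, Low skilled worker, Unskilled worker. I want to add a new column, Class, where high-class workers get the value 1, middle-class workers get 2, and low-class workers get 3.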

My data.frame looks like this:

head(df)
# Job                  
# Manager               
# Supervisor            
# Low skilled worker    
# Low skilled worker    
# Unskilled worker      
# Manager  
# Official
# Official             

The resulting data.frame should look like this:

head(df)
# Job                  Class
# Manager               1
# Supervisor            1
# Low skilled worker    3
# Low skilled worker    3 
# Unskilled worker      3
# Manager               1
# Official              2
# Official              2

Tags: r

Solution


Using base R

df$Class[df$Job %in% c("Manager", "Supervisor")] <- 1
df$Class[df$Job == "Official"] <- 2
df$Class[df$Job %in% c("Low skilled worker", "Unskilled worker")] <- 3
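The same mapping can also be written as a lookup table, which may be easier to maintain as the number of job titles grows. The following is a minimal sketch in base R, not part of the original answer; the name class_lookup is made up for illustration, and the sample data simply repeats the Job values from the question. Any job title missing from the lookup ends up as NA.

```r
# Sample data, repeating the Job values shown in the question
df <- data.frame(
  Job = c("Manager", "Supervisor", "Low skilled worker", "Low skilled worker",
          "Unskilled worker", "Manager", "Official", "Official"),
  stringsAsFactors = FALSE
)

# Hypothetical lookup table: names are job titles, values are class codes
class_lookup <- c(
  "Manager"            = 1,
  "Supervisor"         = 1,
  "Official"           = 2,
  "Low skilled worker" = 3,
  "Unskilled worker"   = 3
)

# Index the lookup by job title. as.character() guards against Job being a
# factor, and unname() drops the job titles that would otherwise be carried
# along as element names.
df$Class <- unname(class_lookup[as.character(df$Job)])

print(df)
```

Adding another job title then only requires a new entry in class_lookup rather than another assignment line.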

Using dplyr

library(dplyr)

df %>% 
  mutate(Class = 
           case_when(
             Job %in% c("Manager", "Supervisor") ~ 1,
             Job == "Official" ~ 2,
             Job %in% c("Low skilled worker", "Unskilled worker") ~ 3
           ))

Using data.table

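library(data.table)

setDT(df)  # convert df to a data.table by reference

# Start every row at 0, then overwrite the rows matching each group
df[, Class := 0]
df[Job %in% c("Manager", "Supervisor"), Class := 1]
df[Job == "Official", Class := 2]
df[Job %in% c("Low skilled worker", "Unskilled worker"), Class := 3]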

This gives us:

  Job                Class
  <chr>              <dbl>
1 Manager                1
2 Supervisor             1
3 Low skilled worker     3
4 Low skilled worker     3
5 Unskilled worker       3
6 Manager                1
7 Official               2
8 Official               2
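The question also lists job levels, such as SelfEmployed, that none of the snippets above assign to a class. With the base R and dplyr versions those rows end up as NA, and with the data.table version they keep the initial 0. A quick way to spot such rows is sketched below in base R; the data is a shortened sample with one extra row added purely for illustration, and the name unmapped is hypothetical.

```r
# Shortened sample data plus one job title that no rule covers
df <- data.frame(
  Job = c("Manager", "Official", "Unskilled worker", "SelfEmployed"),
  stringsAsFactors = FALSE
)

# Same assignments as the base R solution, with Class initialised to NA
df$Class <- NA_real_
df$Class[df$Job %in% c("Manager", "Supervisor")] <- 1
df$Class[df$Job == "Official"] <- 2
df$Class[df$Job %in% c("Low skilled worker", "Unskilled worker")] <- 3

# Job titles that did not match any rule are still NA
unmapped <- unique(df$Job[is.na(df$Class)])
print(unmapped)
```

If unmapped is not empty, those job titles still need a rule before Class can be used in further analysis.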

Data:

df <- structure(list(Job = c("Manager", "Supervisor", "Low skilled worker", 
"Low skilled worker", "Unskilled worker", "Manager", "Official", 
"Official")), row.names = c(NA, -8L), class = c("tbl_df", "tbl", 
"data.frame"))
