Setting a category from another column's values

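Problem description

I want to create a new column, category, with three possible values: low, medium, and high. The value depends on another column. I tried the code below, but it only works for Medium and High; low is never assigned.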

admission$category[admission$gre == 0 | admission$gre <= 440]= "low"


admission$category[admission$gre == 440 | admission$gre <= 580] = "Medium"

admission$category[admission$gre == 580  | admission$gre >= 580] = "High"

admission$category=as.factor(admission$category)

Error:
admission$category[admission$gre == 0 | admission$gre <= 440]= "low"
Warning message:
[<-.factor( *tmp*, admission$gre == 0 | admission$gre <= 440, : invalid factor level, NA generated

str of the data frame: category: Factor w/ 2 levels "High","Medium": 2 1 1 1 2 1 2 2 2 1 ...

Tags: r

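Solution


The assignment fails because category is already a factor whose only levels are "High" and "Medium". A factor accepts only values that are among its levels, so assigning "low" produces NA and the warning. Note also that the second condition in the question, gre <= 580, matches every row that was just labeled low, so each range needs a lower bound. Convert the column to character first, then assign with non-overlapping conditions: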

set.seed(100)
admission = data.frame(category=sample(letters[1:4],100,replace=TRUE),
gre = sample(1:600,100))
admission$category = as.character(admission$category)
admission$category[admission$gre <= 440]= "low"
admission$category[admission$gre > 440 & admission$gre <= 580] = "Medium"
admission$category[admission$gre > 580] = "High"
table(admission$category)

  High    low Medium 
     3     69     28 
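
To see the mechanism behind the warning in isolation, here is a minimal sketch on a made-up three-row data frame (the values are illustrative, not the asker's data). It shows that an unknown level becomes NA, and that extending the levels first is another way around it, as an alternative to the as.character() conversion used above.

```r
# Hypothetical toy data, not the asker's data set
df <- data.frame(gre = c(300, 500, 600))
df$category <- factor(c("Medium", "Medium", "High"))

# Assigning a value that is not an existing level yields NA.
# suppressWarnings() only keeps the demo quiet; without it R warns
# "invalid factor level, NA generated".
bad <- df$category
suppressWarnings(bad[df$gre <= 440] <- "low")
is.na(bad[1])

# Alternative fix: add the missing level before assigning
levels(df$category) <- c(levels(df$category), "low")
df$category[df$gre <= 440] <- "low"
df$category
```

Adding the level keeps the column a factor throughout, which may be preferable if other code already relies on it being one.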

Alternatively, you can simply use cut(), which bins a numeric column into a factor in one step:

admission$category = cut(admission$gre,breaks=c(0,440,580,+Inf),
labels=c("low","Medium","High"))
table(admission$category)

   low Medium   High 
    69     28      3 
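
One edge case may be worth checking: cut() uses right-closed intervals by default, so with breaks starting at 0 a gre of exactly 0 falls outside (0,440] and becomes NA. The question's first condition mentions gre == 0, so if zeros can occur in the data, include.lowest = TRUE is one way to keep them. A small sketch on made-up boundary values (not the asker's data):

```r
# Hypothetical scores chosen to hit each boundary
gre <- c(0, 440, 441, 580, 581, 800)

# Default: intervals are (0,440], (440,580], (580,Inf], so 0 maps to NA
default_cut <- cut(gre, breaks = c(0, 440, 580, Inf),
                   labels = c("low", "Medium", "High"))

# include.lowest = TRUE closes the first interval on the left: [0,440]
category <- cut(gre, breaks = c(0, 440, 580, Inf),
                labels = c("low", "Medium", "High"),
                include.lowest = TRUE)

# Compare the two results side by side
data.frame(gre, default_cut, category)
```

If the categories need to be compared by rank (for example, selecting everything at or above "Medium"), cut() also accepts ordered_result = TRUE to return an ordered factor.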
