Grouping the levels of a variable by group mean in R

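Problem description

I want to group the levels of my Neighborhood variable according to the mean price of each group. Is this the correct way to do it?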

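library(dplyr)

# Label every row with a category based on its neighborhood's mean price.
# The final NA_character_ covers means of 340000 or more, which the
# original nested ifelse() left without an "else" value.
ames.train.c <- ames.train.c %>%
  group_by(Neighborhood) %>%
  mutate(Neighborhood.Cat = ifelse(mean(price) < 140000, "A",
                            ifelse(mean(price) < 200000, "B",
                            ifelse(mean(price) < 260000, "C",
                            ifelse(mean(price) < 300000, "D",
                            ifelse(mean(price) < 340000, "E", NA_character_))))))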

The data can be found here: https://d3c33hcgiwev3.cloudfront.net/_fc6ea3b3b1af3f4fd9afb752e85d4299_ames_train.Rdata?Expires=1633651200&Signature=P7oxFR0IzJ2UP73GI0aJVua67DxUlvoWYhXdQwHf2CZefX2J~0KAxosAWMHtHxcKH81l87~uRBS0FqBb2MUA2UCQUWCg3ldR9mBQypVTq4ofv3wwOq3-r7d6hw1zM72FYfX2oRYgsKzTl5ucb9oQVUa~jBOW1tF3sTtL0h-ykr4_&Key-Pair-Id=APKAJLTNE6QMUY6HBC5A

Tags: r, dplyr

Solution


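I think this approach may help: instead of nesting ifelse() calls, pass the group mean to cut() with a vector of breaks and labels.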

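library(dplyr)

# Boundaries of the price bands and the label for each band
cut_breaks <- c(0, 140000, 200000, 260000, 300000, 340000)
cut_labels <- c("A", "B", "C", "D", "E")

# right = FALSE gives intervals [lower, upper), matching the `<` tests in the question;
# a neighborhood mean of 340000 or more falls outside the breaks and becomes NA
ames.train.c <- ames.train.c %>%
  group_by(Neighborhood) %>%
  mutate(Neighborhood.Cat = cut(mean(price), cut_breaks, labels = cut_labels, right = FALSE)) %>%
  ungroup()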
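
To see how the band boundaries behave, here is a minimal sketch on made-up data. The data frame `toy`, its prices, and the names `toy_cat` and `mean.price` are hypothetical and only stand in for `ames.train.c`. The `right = FALSE` argument is an assumption on my part: it makes each interval closed on the left so that a mean of exactly 140000 lands in "B", which is what the `<` comparisons in the question imply.

```r
library(dplyr)

# Hypothetical toy data standing in for ames.train.c (values are invented)
toy <- data.frame(
  Neighborhood = c("N1", "N1", "N2", "N2", "N3", "N3"),
  price        = c(100000, 120000, 140000, 140000, 300000, 400000)
)

cut_breaks <- c(0, 140000, 200000, 260000, 300000, 340000)
cut_labels <- c("A", "B", "C", "D", "E")

toy_cat <- toy %>%
  group_by(Neighborhood) %>%
  mutate(
    mean.price       = mean(price),  # one value per neighborhood, recycled to every row
    Neighborhood.Cat = cut(mean.price, cut_breaks,
                           labels = cut_labels, right = FALSE)
  ) %>%
  ungroup()

# One row per neighborhood with its mean price and category
print(distinct(toy_cat, Neighborhood, mean.price, Neighborhood.Cat))
```

With these numbers, N1 (mean 110000) is labelled "A", N2 (mean 140000) is labelled "B", and N3 (mean 350000) comes out as NA because it lies above the last break. If NA is not wanted for expensive neighborhoods, one option is to replace the last break with Inf.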
