R: Calculating the proportion of a factor level in a data.frame combined with group by

Problem description

I want to summarise a data frame with several calculations using group by. Input data:

dat <- data.frame(ID = 1:10,
                  var1 = as.factor(c("A","B","A","A","B","B","B","C","A","B")),
                  Var2 = as.factor(c("low","medium","low","low","medium","high","high","high","high","high")))

Now I want to group by var1, count the IDs in each group, and calculate the proportion of rows where Var2 = high. My output should look like this:

  var1 total prop_high
1    A     4      0.25
2    B     5      0.60
3    C     1      1.00

So far I have the following code, but I am stuck on the proportion calculation:

dat2 <- dat %>% 
  group_by(var1) %>%
  summarise(total = n(),
            prop_high = )

Tags: r, dplyr

Solution


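You can take the mean of a logical vector to get the proportion: TRUE counts as 1 and FALSE as 0, so the mean is the share of TRUE values.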

library(dplyr)

dat %>% 
  group_by(var1) %>%
  summarise(total = n(),
            prop_high = mean(Var2 == 'high'))
            # Equivalent to:
            # prop_high = sum(Var2 == 'high') / n()

#   var1  total prop_high
#  <fct> <int>     <dbl>
#1 A         4      0.25
#2 B         5      0.6 
#3 C         1      1   
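
As a rough illustration of why this works, the sketch below uses only base R. It first shows the logical-to-numeric coercion on a small made-up vector, then reproduces the per-group result with tapply(). The names is_high, total, and prop_high_base are chosen here for illustration and are not part of the original answer.

```r
# Rebuild the example data from the question
dat <- data.frame(ID = 1:10,
                  var1 = as.factor(c("A","B","A","A","B","B","B","C","A","B")),
                  Var2 = as.factor(c("low","medium","low","low","medium","high","high","high","high","high")))

# A comparison yields a logical vector; in arithmetic TRUE is 1 and FALSE is 0
is_high <- c("low", "medium", "high", "high") == "high"
sum(is_high)    # 2   (number of TRUE values)
mean(is_high)   # 0.5 (share of TRUE values)

# Base R equivalent of the grouped summary:
# count rows per group and average the logical vector within each level of var1
total <- table(dat$var1)
prop_high_base <- tapply(dat$Var2 == "high", dat$var1, mean)

data.frame(var1 = names(prop_high_base),
           total = as.vector(total),
           prop_high = as.vector(prop_high_base))
```

This should give the same three rows as the dplyr version (A: 4 and 0.25, B: 5 and 0.6, C: 1 and 1). The dplyr pipeline is usually easier to extend when more summary columns are needed.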
