dplyr: grouping by multiple variables inside a function

Question

I want to use two lists of grouping variables, for example:

list1 = c("var2","var3","var4")
list2 = c("var2","var3")

dta = data.frame(var1 = c(1:8),
                 var2 = c(rep("AA",4),rep("BB",4)),
                 var3 = rep(c("C","D"),4),
                 var4 = c(1,1,0,0,0,0,1,1))

dta %>% group_by(var2,var3,var4) %>% summarise(mv1 = mean(var1)) %>% 
  group_by(var2,var3) %>% summarise(mv1_2 = mean(mv1))

How can I create a function like this?

sample_fun = function(dta, list1, list2){

    dta %>% group_by(list1) %>% summarise(mv1 = mean(var1)) %>% 
  group_by(list2) %>% summarise(mv1_2 = mean(mv1))

}

Tags: r, dplyr

Solution


Here are two ways to do this:

  1. A dplyr solution using across
library(dplyr)
library(rlang)

sample_fun = function(dta, list1, list2){
  dta %>% 
    group_by(across(all_of(list1))) %>% 
    summarise(mv1 = mean(var1)) %>% 
    ungroup %>%
    group_by(across(all_of(list2))) %>% 
    summarise(mv1_2 = mean(mv1))
}

sample_fun(dta, list1, list2)
# var2  var3  mv1_2
#  <chr> <chr> <dbl>
#1 AA    C         2
#2 AA    D         3
#3 BB    C         6
#4 BB    D         7
  2. Using non-standard evaluation with syms
sample_fun = function(dta, list1, list2){
  dta %>% 
    group_by(!!!syms(list1)) %>% 
    summarise(mv1 = mean(var1)) %>% 
    ungroup %>%
    group_by(!!!syms(list2)) %>% 
    summarise(mv1_2 = mean(mv1))
}

sample_fun(dta, list1, list2)

#  var2  var3  mv1_2
#  <chr> <chr> <dbl>
#1 AA    C         2
#2 AA    D         3
#3 BB    C         6
#4 BB    D         7
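As a variation on the first approach, here is a minimal self-contained sketch that passes `.groups = "drop"` to `summarise()` instead of calling `ungroup` as a separate step. It assumes dplyr 1.0.0 or later, where `across()` and the `.groups` argument are available. The function name `sample_fun_drop` is hypothetical and only used for illustration.

```r
library(dplyr)

# Same example data and grouping lists as in the question
dta <- data.frame(var1 = 1:8,
                  var2 = c(rep("AA", 4), rep("BB", 4)),
                  var3 = rep(c("C", "D"), 4),
                  var4 = c(1, 1, 0, 0, 0, 0, 1, 1))
list1 <- c("var2", "var3", "var4")
list2 <- c("var2", "var3")

# sample_fun_drop is a hypothetical name for this variant
sample_fun_drop <- function(dta, list1, list2) {
  dta %>%
    # all_of() selects the columns named in the character vector
    group_by(across(all_of(list1))) %>%
    # .groups = "drop" returns an ungrouped result, so no ungroup step is needed
    summarise(mv1 = mean(var1), .groups = "drop") %>%
    group_by(across(all_of(list2))) %>%
    summarise(mv1_2 = mean(mv1), .groups = "drop")
}

res <- sample_fun_drop(dta, list1, list2)
res
```

With the example data this should give the same four rows as the two solutions above, returned as an ungrouped tibble.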

