Loop or use an apply function to generate new variables with a custom function

Question

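I wrote a function that creates a new variable based on the level of another variable. I want to reuse that function to make a bunch of new variables and then use each of them in a separate regression. Which is the most understandable and flexible way to do this: a for loop or some kind of apply function?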

library(tidyverse)
library(estimatr)
treatment_maker <- function(magnitude_change_level) {
  mtcars %>% mutate("treatment_magnitude_{{magnitude_change_level}}":=ifelse(cyl>magnitude_change_level,1,0))
}
#Now I want to loop over the magnitude change levels from 5 to 7, rather than copy and pasting
mtcars <- treatment_maker(5)
mtcars <- treatment_maker(6)
#Then run a separate regression on each variable
lm_robust(mpg ~ treatment_magnitude_5, data = mtcars) %>% tidy()
lm_robust(mpg ~ treatment_magnitude_6, data = mtcars) %>% tidy()

Tags: r, tidyeval

Answer


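We can use map to loop over the magnitude levels: create the binary column with 'treatment_maker', apply lm_robust, get the tidy output, and bind the list elements into a single dataset with the _dfr variant (map_dfr).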

library(purrr)
library(dplyr)
library(broom)
library(tidyr)
library(stringr)
library(estimatr)

- Changed the function so that it works in a loop

treatment_maker <- function(magnitude_change_level) {
    
   mtcars %>% 
      mutate(!!paste0("treatment_magnitude_", magnitude_change_level):=
         as.integer(cyl > magnitude_change_level))
 }
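As a side note on why the column name is built differently here: as far as I understand rlang's glue syntax, "{{ }}" inside a name string labels the expression that was passed in (for example ".x" inside a map lambda), while a single "{ }" interpolates the value. The sketch below is a hypothetical variant (treatment_maker_glue is not from the original answer) that should behave like the !!paste0() version, assuming a reasonably recent dplyr/rlang.

```r
library(dplyr)

# Hypothetical variant of treatment_maker(): single braces in the glue
# string insert the *value* of magnitude_change_level into the column name.
treatment_maker_glue <- function(magnitude_change_level) {
  datasets::mtcars %>%
    mutate("treatment_magnitude_{magnitude_change_level}" :=
             as.integer(cyl > magnitude_change_level))
}

# Called from a loop-like construct, each result carries its own column name
out <- lapply(5:6, treatment_maker_glue)
names(out[[1]])[ncol(out[[1]])]
names(out[[2]])[ncol(out[[2]])]
```

Either spelling works for this purpose; the !!paste0() form used in the answer has the advantage of being explicit about how the name is assembled.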

Now, apply the function

map_dfr(as.numeric(1:10), ~ treatment_maker(.x) %>%
     summarise(out = list(lm_robust(reformulate(str_c('treatment_magnitude_', .x),
         response = 'mpg'), data = cur_data()) %>% tidy)) %>%
       unnest(c(out)))

- Output

# A tibble: 20 x 9
   term                   estimate std.error statistic   p.value conf.low conf.high    df outcome
   <chr>                     <dbl>     <dbl>     <dbl>     <dbl>    <dbl>     <dbl> <dbl> <chr>  
 1 (Intercept)               20.1       1.07     18.9   1.53e-18     17.9     22.3     31 mpg    
 2 treatment_magnitude_1     NA        NA        NA    NA            NA       NA       NA mpg    
 3 (Intercept)               20.1       1.07     18.9   1.53e-18     17.9     22.3     31 mpg    
 4 treatment_magnitude_2     NA        NA        NA    NA            NA       NA       NA mpg    
 5 (Intercept)               20.1       1.07     18.9   1.53e-18     17.9     22.3     31 mpg    
 6 treatment_magnitude_3     NA        NA        NA    NA            NA       NA       NA mpg    
 7 (Intercept)               26.7       1.36     19.6   1.17e-18     23.9     29.4     30 mpg    
 8 treatment_magnitude_4    -10.0       1.52     -6.57  2.84e- 7    -13.1     -6.90    30 mpg    
 9 (Intercept)               26.7       1.36     19.6   1.17e-18     23.9     29.4     30 mpg    
10 treatment_magnitude_5    -10.0       1.52     -6.57  2.84e- 7    -13.1     -6.90    30 mpg    
11 (Intercept)               24.0       1.17     20.4   3.68e-19     21.6     26.4     30 mpg    
12 treatment_magnitude_6     -8.87      1.36     -6.53  3.17e- 7    -11.6     -6.10    30 mpg    
13 (Intercept)               24.0       1.17     20.4   3.68e-19     21.6     26.4     30 mpg    
14 treatment_magnitude_7     -8.87      1.36     -6.53  3.17e- 7    -11.6     -6.10    30 mpg    
15 (Intercept)               20.1       1.07     18.9   1.53e-18     17.9     22.3     31 mpg    
16 treatment_magnitude_8     NA        NA        NA    NA            NA       NA       NA mpg    
17 (Intercept)               20.1       1.07     18.9   1.53e-18     17.9     22.3     31 mpg    
18 treatment_magnitude_9     NA        NA        NA    NA            NA       NA       NA mpg    
19 (Intercept)               20.1       1.07     18.9   1.53e-18     17.9     22.3     31 mpg    
20 treatment_magnitude_10    NA        NA        NA    NA            NA       NA       NA mpg    
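The NA rows in this output follow from the data: cyl in mtcars only takes the values 4, 6 and 8, so for thresholds 1 to 3 and 8 to 10 the indicator is constant and its coefficient cannot be estimated. It also explains why thresholds 4 and 5 (and 6 and 7) give identical fits. A small base R check, offered as a sketch (the names thresholds and n_distinct_values are only illustrative):

```r
# Count how many distinct values the 0/1 indicator takes for each threshold.
# A count of 1 means the column is constant, hence the NA coefficients.
thresholds <- 1:10
n_distinct_values <- sapply(thresholds, function(k) {
  length(unique(as.integer(datasets::mtcars$cyl > k)))
})
names(n_distinct_values) <- paste0("treatment_magnitude_", thresholds)
n_distinct_values
```

Only thresholds 4 through 7 split the cars into two groups, so in practice it may be enough to loop over just those levels.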

Checking against the OP's code

mtcars1 <- treatment_maker(1)
lm_robust(mpg ~ treatment_magnitude_1, data = mtcars1) %>% tidy()
1 coefficient  not defined because the design matrix is rank deficient

                   term estimate std.error statistic      p.value conf.low conf.high df outcome
1           (Intercept) 20.09062  1.065424  18.85693 1.526151e-18 17.91768  22.26357 31     mpg
2 treatment_magnitude_1       NA        NA        NA           NA       NA        NA NA     mpg

Or, as another example

mtcars5 <- treatment_maker(5)
lm_robust(mpg ~ treatment_magnitude_5, data = mtcars5) %>% tidy()
                   term  estimate std.error statistic      p.value  conf.low conf.high df outcome
1           (Intercept)  26.66364  1.359764 19.609015 1.171550e-18  23.88663 29.440645 30     mpg
2 treatment_magnitude_5 -10.01602  1.523651 -6.573696 2.839316e-07 -13.12773 -6.904307 30     mpg
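Since the question also asks about a for loop, here is a minimal sketch of that route for comparison. It is not part of the original answer; dat, levels_to_try, results and all_results are illustrative names, and it assumes estimatr is installed (its tidy() is the one used in the question).

```r
library(estimatr)

dat <- datasets::mtcars          # work on a copy, leave mtcars untouched
levels_to_try <- 5:7             # the range mentioned in the question
results <- vector("list", length(levels_to_try))

for (i in seq_along(levels_to_try)) {
  k <- levels_to_try[i]
  col <- paste0("treatment_magnitude_", k)

  # Add the 0/1 treatment column for this threshold
  dat[[col]] <- as.integer(dat$cyl > k)

  # Build mpg ~ treatment_magnitude_k and fit it with robust standard errors
  fit <- lm_robust(reformulate(col, response = "mpg"), data = dat)
  results[[i]] <- tidy(fit)
}

# Stack the per-threshold coefficient tables into one data frame
all_results <- do.call(rbind, results)
all_results
```

The loop keeps every new column in one data frame, which matches the copy-and-paste workflow in the question, whereas the map_dfr version returns only the stacked regression tables.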
