Using mutate with a stored list of formulas over specified columns

Problem description

This is a follow-up to my previous question here, which @ronak_shah was kind enough to answer. I apologize, as some of this information may be redundant to anyone who saw that post, but I figured it best to post a new question rather than modify the previous one.

I would still like to iterate through a stored list of columns and procedures to create n new columns based on this list. In the example below, we start with 3 columns, a, b, c and a simple function, func1.

The data frame col_mod identifies which column should be changed, what the second argument to the function that changes them should be, and then generates a statement to execute the function. Each of these modifications should be an addition to the original data frame, rather than replacements of the specified columns. The new names of these columns should be a_new and c_new, respectively.

At the bottom of the reprex below, I am able to obtain my desired result manually, but as before, I would like to automate this using a mapping function.

I am attempting to use the same approach that was provided as an answer to my previous question, but I keep on getting the following error: "Error in get(as.character(FUN), mode = "function", envir = envir) : object 'func1(a,3)' of mode 'function' was not found"

If anyone can help, it would be much appreciated!

library(tidyverse)

## fake data
dat <- data.frame(a = 1:5,
                  b = 6:10,
                  c = 11:15)

## function
func1 <- function(x, y) {x + y}

## modification list
col_mod <- data.frame("col" = c("a", "c"),
                      "y_val" = c(3, 4),
                      stringsAsFactors = FALSE) %>% 
  mutate(func = paste0("func1(", col, ",", y_val, ")"))

## desired end result
dat %>% 
  mutate(a_new = func1(a, 3),
         c_new = func1(c, 4))

## attempting to generate new columns based on @ronak_shah's answer to my previous
## question but fails to run
dat[paste0(col_mod$col, '_new')] <- Map(function(x, y) match.fun(y)(x), 
                                      dat[col_mod$col], col_mod$func)

Tags: r

Solution


We can use pmap from purrr to loop over the rows of col_mod. Inside the lambda, ..1 is the column name from 'col', ..2 is the value from 'y_val', and ..3 is the function from 'func'. For each row, transmute builds one new column, assigning (:=) the result to a name created with str_c (or paste). Finally, bind_cols attaches the new columns to the original dataset.

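library(dplyr)
library(purrr)
library(stringr)

# match.fun() needs a bare function name, so store 'func1' instead of the call string
col_mod$func <- 'func1'

# for each row of col_mod: ..1 = col, ..2 = y_val, ..3 = func
pmap(col_mod, ~ dat %>%
       transmute(!! str_c(..1, "_new") :=
                   match.fun(..3)(!! rlang::sym(..1), ..2))) %>%
  bind_cols(dat, .)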

-output

#  a  b  c a_new c_new
#1 1  6 11     4    15
#2 2  7 12     5    16
#3 3  8 13     6    17
#4 4  9 14     7    18
#5 5 10 15     8    19
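
For context on why the func column is overwritten with 'func1' above: match.fun() looks a function up by name, so a string that holds a whole call such as "func1(a,3)" is not something it can resolve. The following is only a minimal sketch of that behaviour, reusing func1 from the question; the names f and res are illustrative.

```r
func1 <- function(x, y) x + y

# A bare function name resolves to the function itself.
f <- match.fun("func1")
f(1:5, 3)

# A string holding a whole call is not a function name, so the lookup
# raises an error; tryCatch() captures its message instead of stopping.
res <- tryCatch(match.fun("func1(a,3)"),
                error = function(e) conditionMessage(e))
res
```

This is the same lookup failure reported in the question's error message.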

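If we want to evaluate the stored call strings as they are, leave the func column unchanged, so that it still holds func1(a,3) and func1(c,4), and use rlang::parse_expr with eval. This needs the original col_mod, i.e. without the col_mod$func <- 'func1' overwrite from the first approach.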

pmap(col_mod, ~ dat %>%
  transmute(!! str_c(..1, "_new") := 
      eval(rlang::parse_expr(..3)))) %>%
   bind_cols(dat, .)

-output

#  a  b  c a_new c_new
#1 1  6 11     4    15
#2 2  7 12     5    16
#3 3  8 13     6    17
#4 4  9 14     7    18
#5 5 10 15     8    19
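
A possible variation on the parse-and-evaluate idea, sketched here under the assumption that all stored strings can be parsed up front: turn each string into an unevaluated call, name the calls after their target columns, and splice them into a single mutate() with !!!. The names new_cols and result are illustrative and not part of the original answer.

```r
library(dplyr)

dat <- data.frame(a = 1:5, b = 6:10, c = 11:15)
func1 <- function(x, y) x + y
col_mod <- data.frame(col = c("a", "c"), y_val = c(3, 4),
                      stringsAsFactors = FALSE)
col_mod$func <- paste0("func1(", col_mod$col, ",", col_mod$y_val, ")")

# Parse each stored string into an unevaluated call and name it
# after the column it should create.
new_cols <- setNames(lapply(col_mod$func, rlang::parse_expr),
                     paste0(col_mod$col, "_new"))

# Splice the named calls into one mutate(); dplyr evaluates them against dat.
result <- dat %>% mutate(!!!new_cols)
result
```

Compared with pmap plus bind_cols, this keeps everything in one mutate() call, at the cost of parsing every string before any column is computed.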

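Or use base R with Map, again with the original call strings in the func column

# x = col, y = y_val, z = func; each call string is evaluated with dat's columns in scope
dat[paste0(col_mod$col, '_new')] <- do.call(Map, c(f =
   function(x, y, z) eval(parse(text = z), envir = dat), unname(col_mod)))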
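
If evaluating strings is a concern, one alternative is to avoid parse and eval altogether. The sketch below assumes the modification table stores the function name in a hypothetical fun column and resolves it through a small named list (funs, also hypothetical), so only functions listed there can be called.

```r
dat <- data.frame(a = 1:5, b = 6:10, c = 11:15)
func1 <- function(x, y) x + y

# Hypothetical registry of the functions that col_mod may refer to.
funs <- list(func1 = func1)

col_mod <- data.frame(col = c("a", "c"), y_val = c(3, 4),
                      fun = c("func1", "func1"),
                      stringsAsFactors = FALSE)

# For each row: pick the column by name, look the function up in funs,
# and call it with the stored second argument.
dat[paste0(col_mod$col, "_new")] <- Map(
  function(col, y, fun) funs[[fun]](dat[[col]], y),
  col_mod$col, col_mod$y_val, col_mod$fun
)
dat
```

Because the column name, the second argument, and the function name are stored separately, nothing has to be pasted into a call string and parsed back.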
