How to use purrr modify_if with multiple functions that take different arguments?

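Problem description

Problem solved!

Question:

In R, I have been trying to find an elegant way to apply multiple functions, each with different arguments, to a list containing many tibbles/data.frames, but I am struggling to pass the arguments through correctly. I am trying to clean and preprocess text data on pharmaceuticals, and I have been experimenting with modify_if, invoke, map and others. Any help is greatly appreciated.

Note: I have only just started learning to program, so please bear with me :)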

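# Load the packages used throughout (tibble() is available via dplyr)
library(dplyr)
library(purrr)
library(stringr)

# Set up Example Data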
Test_DataFrame <- tibble("Integer_Variable" = c(rep(x = 1:4))
             ,"Character_Variable" = c("tester to upper"
                          ,"test   squishing"
                          ,"canitcomprehend?.,-0(`kljndsfiuhaweraeriou140987645=Error?"
                          ,"         test white space triming      " ))

# modify_if works with a single function and its arguments:
# Modify character columns by trimming the left side of each string
modify_if(.x = Test_DataFrame
      ,.p = is.character
      ,.f = str_trim
      , side = "left") # Works well
# Expected results
# A tibble: 4 x 2
#   Integer_Variable Character_Variable                                          
#              <int> <chr>                                                       
# 1                1 "tester to upper"                                           
# 2                2 "test   squishing"                                          
# 3                3 "canitcomprehend?.,-0(`kljndsfiuhaweraeriou140987645=Error?"
# 4                4 "test white space triming      "   
####### Note the trailing whitespace on the right, which shows that the argument is being applied!

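However, when I try to do this with multiple functions that take arguments, I hit a wall (the function arguments are ignored). I have tried many combinations of modify_if (some of them below) and other functions, such as invoke (but it has been retired) and exec with map (which made no sense to me). No success so far. Any help is greatly appreciated.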

# does not work
modify_if(.x = Test_DataFrame
      ,.p = is.character                # = the condition to specify which column to apply the functions to  
      ,.f = c(                      # a pairwise list of "name" = "function to apply" to apply to each column where the condition = TRUE
        UpperCase = str_to_upper        # Convert strings to upper case
        ,TrimLeadTailWhiteSpace = str_trim  # trim leading and ending whitespace
        ,ExcessWhiteSpaceRemover = str_squish)  # if you find any double or more whitespaces (eg "  " or "   ") then cut it down to " " 
      , side = "left"              # these arguments are being ignored
    )

# Does not work
modify_if(.x = Test_DataFrame
      ,.p = is.character
      ,.f = c(UpperCase = list(str_to_upper)    # the list variant doesn't work either
        ,TrimLeadTailWhiteSpace = list(str_trim, side = "left")
        ,ExcessWhiteSpaceRemover = list(str_squish))
    ) # returns the integer variable instead of the character so drastically wrong

# Set up Function - Argument Table
Function_ArgumentList <- tibble("upper" = list(str_to_upper)
                   ,"trim" = list(str_trim, side = "left")
                   ,"squish" = list(str_squish))

# Doesn't work
modify_if(.x = Test_DataFrame
      ,.p = is.character
      ,.f = Function_ArgumentList)
# Error: Can't convert a `tbl_df/tbl/data.frame` object to function
# Run `rlang::last_error()` to see where the error occurred.

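I realise the functions in the example above could be used without arguments, but this is a simplified version of the problem I am actually facing.

Solution:

Thanks to @stefan and @BenNorris for their help below ;p To make @stefan's solution clearer, I modified the answer slightly: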

library(dplyr)
library(purrr)
library(stringr)
Test_DataFrame <- tibble("Integer_Variable" = c(rep(x = 1:4))
                        ,"Character_Variable" = c("tester to upper"
                                                ,"test   squishing"
                                                ,"canitcomprehend?.,-0(`kljndsfiuhaweraeriou140987645=Error?"
                                                ,"         test white space triming      " )
                        )
f_help <- function(x, side = "left") {
                str_to_upper(x) %>% 
                str_trim(side = side) # %>% 
                # str_squish()                # note that this is commented out
                }

modify_if(.x = Test_DataFrame
        ,.p = is.character
        ,.f = f_help
        ,side = "left") 
# A tibble: 4 x 2
#   Integer_Variable Character_Variable
#              <int> <chr>
# 1                1 "TESTER TO UPPER"
# 2                2 "TEST   SQUISHING"
# 3                3 "CANITCOMPREHEND?.,-0(`KLJNDSFIUHAWERAERIOU140987645=ERROR?"
# 4                4 "TEST WHITE SPACE TRIMING      "
# Note the right-side whitespace is still present, so the side argument was applied!

Tags: r, tidyverse, purrr

Answer


As far as I know, there are two ways to solve this:

  1. Use a helper function
  2. Use purrr::compose

library(dplyr)
library(purrr)
library(stringr)

Test_DataFrame <- tibble("Integer_Variable" = c(rep(x = 1:4))
                         ,"Character_Variable" = c("tester to upper"
                                                   ,"test   squishing"
                                                   ,"canitcomprehend?.,-0(`kljndsfiuhaweraeriou140987645=Error?"
                                                   ,"         test white space triming      " ))

f_help <- function(x, side = "left") {
  str_to_upper(x) %>% 
    str_trim(side = side) %>% 
    str_squish()
}

modify_if(.x = Test_DataFrame,
          .p = is.character,
          .f = f_help, side = "left"
)
#> # A tibble: 4 x 2
#>   Integer_Variable Character_Variable                                        
#>              <int> <chr>                                                     
#> 1                1 TESTER TO UPPER                                           
#> 2                2 TEST SQUISHING                                            
#> 3                3 CANITCOMPREHEND?.,-0(`KLJNDSFIUHAWERAERIOU140987645=ERROR?
#> 4                4 TEST WHITE SPACE TRIMING
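
The helper above hard-codes its three steps. If the steps and their arguments need to live in a table-like structure, as the question attempted, one possible variant is to store each function next to its own argument list and fold over them with reduce. This is only a sketch: cleaning_steps, apply_steps and example_df are hypothetical names that do not appear in the original post, and do.call is used here as a plain base R way to call a function with a list of arguments.

```r
library(purrr)
library(stringr)
library(tibble)

# Hypothetical step table: each entry pairs a function with its own extra arguments.
cleaning_steps <- list(
  upper  = list(fn = str_to_upper, args = list()),
  trim   = list(fn = str_trim,     args = list(side = "left")),
  squish = list(fn = str_squish,   args = list())
)

# Hypothetical helper: pass x through every step in order,
# supplying each step's own arguments after the data.
apply_steps <- function(x, steps) {
  reduce(
    steps,
    function(acc, step) do.call(step$fn, c(list(acc), step$args)),
    .init = x
  )
}

# Small illustrative data set (not the data from the question).
example_df <- tibble(
  Integer_Variable = 1:4,
  Character_Variable = c("tester to upper", "test   squishing", "  padded  ", "x")
)

# The step table travels through modify_if's ... to apply_steps.
cleaned <- modify_if(example_df, is.character, apply_steps, steps = cleaning_steps)
cleaned
```

Adding or reordering a step then only means editing cleaning_steps, while the modify_if call stays the same.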

modify_if(.x = Test_DataFrame,
          .p = is.character,
          .f = purrr::compose(str_to_upper, ~ str_trim(.x, side = "left"), str_squish)
)
#> # A tibble: 4 x 2
#>   Integer_Variable Character_Variable                                        
#>              <int> <chr>                                                     
#> 1                1 TESTER TO UPPER                                           
#> 2                2 TEST SQUISHING                                            
#> 3                3 CANITCOMPREHEND?.,-0(`KLJNDSFIUHAWERAERIOU140987645=ERROR?
#> 4                4 TEST WHITE SPACE TRIMING
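
One detail worth knowing about the compose version: by default purrr::compose applies its functions from right to left, so the call above runs str_squish first and str_to_upper last. For these three functions the result happens to be the same either way, but when order matters the .dir argument can make the pipeline read left to right. A small sketch, where add_one, double_it and clean_text are hypothetical names introduced only for illustration:

```r
library(purrr)
library(stringr)

# Two toy functions whose order of application changes the result.
add_one   <- function(x) x + 1
double_it <- function(x) x * 2

backward <- compose(add_one, double_it)                    # add_one(double_it(x))
forward  <- compose(add_one, double_it, .dir = "forward")  # double_it(add_one(x))

backward(3)  # 7
forward(3)   # 8

# The same cleaning pipeline, written so it reads in the order it runs.
clean_text <- compose(
  str_to_upper,
  function(x) str_trim(x, side = "left"),
  str_squish,
  .dir = "forward"
)

clean_text("   test   squishing  ")  # "TEST SQUISHING"
```

clean_text can be passed as .f to modify_if in the same way as the composed function above.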
