Function not picking up the level for detecting NA observations

Problem description

I am trying to create a function that cross-tabulates whether the missing and non-missing values in two variables overlap.

The function takes two variables and a dataset. It looks like this:

absent_2by2 <- function(var1, var2, data){
  
  require(tidyverse)
  require(gtsummary)
  require(data.table)
  
  
  data %>% 
    as.data.table() %>% 
    mutate(var1_c = 0) %>% 
    .[!is.na(var1), var1_c := 1] %>% 
    .[is.na(var1), var1_c := 2] %>%
    mutate(var1_c = as.factor(var1_c),
           var1_c = fct_recode(var1_c,
                               "Present" = "1",
                               "Absent" = "2")
           ) %>% 
    mutate(var2_c = 0) %>% 
    .[!is.na(var2), var2_c := 1] %>% 
    .[is.na(var2), var2_c := 2] %>% 
    mutate(var2_c = as.factor(var2_c),
           var2_c = fct_recode(var2_c,
                               "Present" = "1",
                               "Absent" = "2")
           ) %>% 
           gtsummary::tbl_cross(data, 
                                 var2_c, var1_c,
                                 percent = "no")
  }

When I call the function with the following code:

absent_2by2("Ozone", "Solar.R", airquality)

...the output looks like this:

[Image: screenshot of the resulting table]

...and these are the warnings I get:

Warning messages:
1: Problem with `mutate()` column `var1_c`.
ℹ `var1_c = fct_recode(var1_c, Present = "1", Absent = "2")`.
ℹ Unknown levels in `f`: 2 
2: Problem with `mutate()` column `var2_c`.
ℹ `var2_c = fct_recode(var2_c, Present = "1", Absent = "2")`.
ℹ Unknown levels in `f`: 2 

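It seems the function is not picking up level 2 for either of my two variables. I am not sure why, because when I run the same code as a standalone pipeline, I get the correct output. The standalone code looks like this: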

 require(tidyverse)
 require(gtsummary) 
 require(data.table)

    airquality %>% 
      as.data.table() %>% 
      mutate(var1_c = 0) %>% 
      .[!is.na(Ozone), var1_c := 1] %>% 
      .[is.na(Ozone), var1_c := 2] %>%
      mutate(var1_c = as.factor(var1_c),
             var1_c = fct_recode(var1_c,
                                 "Present" = "1",
                                 "Absent" = "2")
      ) %>% 
      mutate(var2_c = 0) %>% 
      .[!is.na(Solar.R), var2_c := 1] %>% 
      .[is.na(Solar.R), var2_c := 2] %>% 
      mutate(var2_c = as.factor(var2_c),
             var2_c = fct_recode(var2_c,
                                 "Present" = "1",
                                 "Absent" = "2")
      ) %>%  
      gtsummary::tbl_cross(., 
                                   var2_c, var1_c,
                                   percent = "no"
    )

The output looks like this:

[Image: screenshot of the resulting table]

I would be grateful if someone could point me in the right direction. Thanks!

Tags: r, function, dplyr, gtsummary

Solution


I think this should work for you.

absent_2by2 <- function(data, var1, var2) {
  # recode var1 and var2 as two-level factors flagging NA values
  data <-
    dplyr::mutate(
      data,
      dplyr::across(
        .cols = all_of(c(var1, var2)),
        .fns = ~factor(is.na(.), 
                       levels = c(FALSE, TRUE), 
                       labels = c("Present", "Absent"))
      )
    )
  
  # cross tabulate missing values
  gtsummary::tbl_cross(data, row = all_of(var1), col = all_of(var2))
}

absent_2by2(gtsummary::trial, "age", "trt")
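
As for why the original function only ever produced the "Present" level: a likely cause (my reading of the code, not something confirmed in the question) is that `var1` holds the string "Ozone" inside the function, so `is.na(var1)` tests that single string rather than the column it names. A minimal base-R sketch of the difference, using the built-in `airquality` data:

```r
# Inside the function, var1 is a character string naming the column.
var1 <- "Ozone"

# Testing the string itself: a non-missing string is never NA,
# so this is a single FALSE no matter what the column contains.
flag_from_name <- is.na(var1)

# Testing the column the string refers to: one TRUE/FALSE per row.
flag_from_column <- is.na(airquality[[var1]])

print(flag_from_name)
print(table(flag_from_column))
```

With the string version, every row is presumably coded 1 and no row is coded 2, which would match the `Unknown levels in f: 2` warning. In the standalone pipeline, `Ozone` is written as a bare column name, so the column itself is tested.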

[Image: screenshot of the resulting table]
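
If you want to sanity-check the counts without gtsummary, the same recoding can be sketched in base R. `absent_2by2_base` is a hypothetical helper name used only for this illustration, and it assumes `var1` and `var2` are column names passed as strings:

```r
# Hypothetical helper: cross-tabulate missingness of two columns in base R.
absent_2by2_base <- function(data, var1, var2) {
  # Recode a column to a two-level factor; both levels are declared
  # explicitly so "Absent" is kept even when a column has no NAs.
  to_flag <- function(x) {
    factor(is.na(x), levels = c(FALSE, TRUE), labels = c("Present", "Absent"))
  }
  tab <- table(to_flag(data[[var1]]), to_flag(data[[var2]]))
  # Label the two dimensions with the column names.
  names(dimnames(tab)) <- c(var1, var2)
  tab
}

result <- absent_2by2_base(airquality, "Ozone", "Solar.R")
print(result)
```

Declaring the levels up front is the same design choice as in the function above: the table always has both rows and both columns, so there is no level left for a later recoding step to miss.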

