Performing multiple fisher.test calls on groups in a data frame

Problem description

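I want to run fisher.test() multiple times on the data frame below, once for each column (ABCB1 and ABL1 in the example below). The contingency table for each tax should be built from its rows as shown below.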

Edit:

The contingency table should be calculated as in the following example:

   data_frame(c(42,1),c(20,3))

Example contingency table:

               ABCB1  NotABCB1
tax1Present       42         1
tax1NotPresent    20         3



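42 is 43 - 1 (Present minus the ABCB1 value in row tax1Present)
1 is the ABCB1 value in row tax1Present
20 is 23 - 3 (NotPresent minus the ABCB1 value in row tax1NotPresent)
3 is the ABCB1 value in row tax1NotPresent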
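
The arithmetic above can be checked directly. The following is a minimal sketch that hard-codes the four tax1 / ABCB1 counts from the example; the names `tab` and `res` are illustrative and not part of the question.

```r
# 2x2 table for tax1 / ABCB1, one row per group as in the example above
tab <- matrix(
  c(42, 1,    # tax1Present:    43 - 1, then the ABCB1 count 1
    20, 3),   # tax1NotPresent: 23 - 3, then the ABCB1 count 3
  nrow = 2, byrow = TRUE,
  dimnames = list(c("tax1Present", "tax1NotPresent"), NULL)
)

res <- fisher.test(tab)  # two-sided Fisher's exact test
res$p.value
```

The two-sided p-value of a 2x2 Fisher test does not change when the table is transposed, so building the table column-wise, as data_frame(c(42,1),c(20,3)) does, should give the same result.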

Data:

structure(list(group = c("tax1Present", "tax1NotPresent", "tax2Present", 
"tax2NotPresent", "tax3Present", "tax3NotPresent", "tax4Present", 
"tax4NotPresent", "tax5Present", "tax5NotPresent"), ABCB1 = c(1L, 
3L, 4L, 5L, 3L, 6L, 6L, 3, 2, 6L), ABL1 = c(18L, 14, 12L, 
9L, 1L, 5L, 0L, 0L, 7L, 0L), Present = c(43L, 43, 23L, 23, 
9L, 9, 7L, 7, 20, 20L),NotPresent = c(23, 23, 18, 18, 
7L, 7L, 10L, 10L, 10, 10L), tax = c("tax1", "tax1", "tax2", 
"tax2", "tax3", "tax3", "tax4", "tax4", "tax5", "tax5")), row.names = c(NA, 
10L), class = "data.frame")


> df
            group ABCB1 ABL1 Present NotPresent  tax
1     tax1Present     1   18      43         23 tax1
2  tax1NotPresent     3   14      43         23 tax1
3     tax2Present     4   12      23         18 tax2
4  tax2NotPresent     5    9      23         18 tax2
5     tax3Present     3    1       9          7 tax3
6  tax3NotPresent     6    5       9          7 tax3
7     tax4Present     6    0       7         10 tax4
8  tax4NotPresent     3    0       7         10 tax4
9     tax5Present     2    7      20         10 tax5
10 tax5NotPresent     6    0      20         10 tax5

Tags: r, dataframe, tidyverse

Solution


Try using the apply family:

Data:

df <- structure(list(group = c("tax1Present", "tax1NotPresent", "tax2Present", 
"tax2NotPresent", "tax3Present", "tax3NotPresent", "tax4Present", 
"tax4NotPresent", "tax5Present", "tax5NotPresent"), ABCB1 = c(1L, 
3L, 4L, 5L, 3L, 6L, 6L, 12L, 13L, 6L), ABL1 = c(18, 14, 12, 9, 
1, 5, 0, 0, 7, 0), Present = c(43, 43, 23, 23, 9, 9, 7, 7, 20, 
20), NotPresent = c(23, 23, 18, 18, 7, 7, 1, 13, 10, 10), tax = c("tax1", 
"tax1", "tax2", "tax2", "tax3", "tax3", "tax4", "tax4", "tax5", 
"tax5")), row.names = c(NA, 10L), class = "data.frame")
# set the columns to use
columns <- c("ABCB1", "ABL1")

# for each column and each tax group, build the 2x2 table and run Fisher's exact test
dat_test <- sapply( columns, function(colx) 
  lapply( unique(df$tax), function(x) {
    rows <- which(df$tax %in% x)  # rows[1] is the Present row, rows[2] the NotPresent row
    fisher.test( data.frame( 
      a=c( df[rows[1], "Present"] - df[rows[1], colx], df[rows[1], colx] ), 
      b=c( df[rows[2], "NotPresent"] - df[rows[2], colx], df[rows[2], colx] ) ))
  } ) )

# set rownames
rownames(dat_test) <- unique( df$tax )

dat_test
     ABCB1  ABL1  
tax1 List,7 List,7
tax2 List,7 List,7
tax3 List,7 List,7
tax4 List,7 List,7
tax5 List,7 List,7

Test:

# p-values match a manual calculation (and are exactly the same
# values as with the previous df$Total):
     ABCB1      ABL1      
tax1 0.1179487  0.1971581 
tax2 0.4709802  1         
tax3 0.06013986 0.03496503
tax4 1          1         
tax5 1          0.06371942
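
The answer does not show how the p-values were pulled out of dat_test. One possible way is sketched below, under a few assumptions: the data frame is rebuilt compactly with the same values as the answer's df (only the columns the test needs), and `fisher_for`, `taxa`, and `p_values` are hypothetical names introduced for illustration.

```r
# Same values as the answer's df, restricted to the columns used by the test
df <- data.frame(
  tax        = rep(paste0("tax", 1:5), each = 2),
  ABCB1      = c(1, 3, 4, 5, 3, 6, 6, 12, 13, 6),
  ABL1       = c(18, 14, 12, 9, 1, 5, 0, 0, 7, 0),
  Present    = c(43, 43, 23, 23, 9, 9, 7, 7, 20, 20),
  NotPresent = c(23, 23, 18, 18, 7, 7, 1, 13, 10, 10)
)
columns <- c("ABCB1", "ABL1")
taxa <- unique(df$tax)

# Hypothetical helper: one Fisher test for one column and one tax group
fisher_for <- function(colx, x) {
  rows <- which(df$tax == x)  # rows[1]: Present row, rows[2]: NotPresent row
  tab <- cbind(
    a = c(df[rows[1], "Present"]    - df[rows[1], colx], df[rows[1], colx]),
    b = c(df[rows[2], "NotPresent"] - df[rows[2], colx], df[rows[2], colx])
  )
  fisher.test(tab)
}

# List-matrix of test results, same shape as dat_test in the answer
dat_test <- sapply(columns, function(colx) lapply(taxa, function(x) fisher_for(colx, x)))
rownames(dat_test) <- taxa

# Extract the p-value from every cell, keeping the tax-by-column layout
p_values <- matrix(
  sapply(dat_test, function(ht) ht$p.value),
  nrow = nrow(dat_test), dimnames = dimnames(dat_test)
)
p_values
```

Each cell of dat_test is a full fisher.test result, so other components such as estimate or conf.int can be extracted the same way by swapping the component name.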

See also: https://stats.stackexchange.com/questions/332224/2x2-fisher-exact-test-contingency-tables

