Pairwise correlations from Dunnett's rank test

Problem description

I want to show the results of Dunnett's test in a heatmap that highlights the correlations between groups.

Output:

                           mean.rank.diff    pval    
EpisodeFourL-EpisodeFiveL      -51.418401 0.33175    
EpisodeOneL-EpisodeFiveL        38.505311 1.00000    
EpisodeSixL-EpisodeFiveL        34.267816 1.00000    
EpisodeThreeL-EpisodeFiveL     -68.548095 0.07237 .  
EpisodeTwoL-EpisodeFiveL       -93.324843 0.00504 ** 
EpisodeOneL-EpisodeFourL        89.923712 0.03094 *  
EpisodeSixL-EpisodeFourL        85.686217 0.12094    
EpisodeThreeL-EpisodeFourL     -17.129694 1.00000    
EpisodeTwoL-EpisodeFourL       -41.906442 0.60473    
EpisodeSixL-EpisodeOneL         -4.237495 1.00000    
EpisodeThreeL-EpisodeOneL     -107.053407 0.00484 ** 
EpisodeTwoL-EpisodeOneL       -131.830154 0.00024 ***
EpisodeThreeL-EpisodeSixL     -102.815911 0.03506 *  
EpisodeTwoL-EpisodeSixL       -127.592659 0.00484 ** 
EpisodeTwoL-EpisodeThreeL      -24.776748 1.00000 

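How can I make a "correlation matrix of p-values" that looks like the example image, where the cells record the mean rank diff and the p-values?

Thanks for your time.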

[Image: example of the desired heatmap]

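I am struggling with the following steps:

  1. Pairwise comparisons: how to arrange my data so that the episode names appear on both axes;
  2. How to split the episodes into two groups, M and L;
  3. How to create a correlation heatmap that shows the mean rank diff values in the cells and uses the p-values to color the cells.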

Sample data:

df<-structure(list(mean.rank.diff = c(31.793661, 50.78439, -93.432344, 
-61.09784, -30.52092, -43.07989, 26.230952, 65.94858, 11.569245, 
20.41009, -125.226005, -111.88223, -62.31458, -93.86428, -5.562709, 
15.16419, -20.224416, -30.3743, 62.911425, 18.01795, 119.663297, 
127.04642, 105.00159, 81.50793, 56.751872, 109.02847, 42.090165, 
63.48998, -14.661707, -45.53849), pval = c(1, 0.43984, 0.03031, 
0.37802, 1, 1, 1, 0.1446, 1, 1, 0.00049, 0.00207, 0.85499, 0.10108, 
1, 1, 1, 1, 1, 1, 0.00098, 0.00033, 0.00782, 0.09761, 1, 0.03568, 
1, 0.60994, 1, 0.60994)), class = "data.frame", row.names = c("EpisodeFourL-EpisodeFiveL", 
"EpisodeFourM-EpisodeFiveM", "EpisodeOneL-EpisodeFiveL", "EpisodeOneM-EpisodeFiveM", 
"EpisodeSixL-EpisodeFiveL", "EpisodeSixM-EpisodeFiveM", "EpisodeThreeL-EpisodeFiveL", 
"EpisodeThreeM-EpisodeFiveM", "EpisodeTwoL-EpisodeFiveL", "EpisodeTwoM-EpisodeFiveM", 
"EpisodeOneL-EpisodeFourL", "EpisodeOneM-EpisodeFourM", "EpisodeSixL-EpisodeFourL", 
"EpisodeSixM-EpisodeFourM", "EpisodeThreeL-EpisodeFourL", "EpisodeThreeM-EpisodeFourM", 
"EpisodeTwoL-EpisodeFourL", "EpisodeTwoM-EpisodeFourM", "EpisodeSixL-EpisodeOneL", 
"EpisodeSixM-EpisodeOneM", "EpisodeThreeL-EpisodeOneL", "EpisodeThreeM-EpisodeOneM", 
"EpisodeTwoL-EpisodeOneL", "EpisodeTwoM-EpisodeOneM", "EpisodeThreeL-EpisodeSixL", 
"EpisodeThreeM-EpisodeSixM", "EpisodeTwoL-EpisodeSixL", "EpisodeTwoM-EpisodeSixM", 
"EpisodeTwoL-EpisodeThreeL", "EpisodeTwoM-EpisodeThreeM"))

Tags: r, ggplot2

Solution


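Maybe this is what you are looking for:

  1. Using dplyr, tidyr and stringr, you can split your row names into episodes and types (M or L).
  2. After the data wrangling, you can build the plot with geom_tile, geom_text and facet_grid.
  3. Finally, I made some adjustments to put the facet labels outside and the x axis on top.
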
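To make the first step concrete, here is a minimal base-R sketch of what the split does to a single row name. It is only an illustration, not the answer's code, which relies on tidyr::separate and stringr for the same job. The variable names `nm`, `parts`, `type` and `episode` are made up for this example.

```r
# One row name taken from the sample data in the question
nm <- "EpisodeFourL-EpisodeFiveL"

# Split the pair at the hyphen into its two episode labels
parts <- strsplit(nm, "-", fixed = TRUE)[[1]]

# The last character of each label is the group (M or L)
type <- substr(parts, nchar(parts), nchar(parts))

# Dropping that last character leaves the bare episode name
episode <- sub(".$", "", parts)

print(episode)
print(type)
```

Running this on a few row names is a quick way to check that every label really ends in a single M or L before wrangling the whole data frame.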
library(ggplot2)
library(tidyr)
library(dplyr)

# Episode names as they appear in the row names, and the labels to display, in plot order
levels <- paste0("Episode", c("One", "Two", "Three", "Four", "Five", "Six"))
labels <- paste("Episode", c("One", "Two", "Three", "Four", "Five", "Six"))
df1 <- df %>% 
  # Move the row names into a column and split each pair at the hyphen
  mutate(episodes = row.names(.)) %>% 
  separate(episodes, into = c("episode1", "episode2")) %>% 
  # The last character of each name is the group (M or L); strip it from the episode name
  mutate(type1 = stringr::str_extract(episode1, ".$"), 
         type2 = stringr::str_extract(episode2, ".$"),
         across(c(episode1, episode2), ~ stringr::str_remove(., ".$")),
         across(c(episode1, episode2), ~ factor(., levels = levels, labels = labels)),
         across(c(type1, type2), ~ factor(., levels = c("M", "L"))))

# One facet per episode pair; tiles are filled by p-value and labelled with the mean rank difference
ggplot(df1, aes(type1, forcats::fct_rev(type2), fill = pval)) +
  geom_tile() +
  geom_text(aes(label = scales::number(mean.rank.diff, accuracy = .1))) +
  facet_grid(episode1 ~ episode2, switch = "y") +
  scale_x_discrete(position = "top") +
  theme(strip.placement = "outside") +
  labs(x = NULL, y = NULL)
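
As an optional variation, the continuous p-value fill could be replaced by discrete significance classes, which may be easier to read in a heatmap. The sketch below is an assumption rather than part of the original answer: the cut points follow R's conventional significance codes, which match the stars in the question's output, and the names `pval` and `sig` are chosen only for illustration.

```r
# A few p-values taken from the output in the question
pval <- c(1, 0.07237, 0.03031, 0.00504, 0.00024)

# Bin the p-values into the usual significance classes
# (intervals are closed on the right; include.lowest keeps p = 0 in the first bin)
sig <- cut(pval,
           breaks = c(0, 0.001, 0.01, 0.05, 0.1, 1),
           labels = c("***", "**", "*", ".", "n.s."),
           include.lowest = TRUE)

print(data.frame(pval, sig))
```

Applied to the wrangled data, this would amount to adding a column such as `df1$sig` built from `df1$pval` and mapping `fill = sig` instead of `fill = pval`, which gives a discrete legend.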

