Removing NA from a plot when using fct_relevel()

Problem description

My data contains some NA values, and I am trying to make a plot like the one below:

library(ggplot2)
library(forcats)
library(dplyr)
library(ggpubr)

df<-data.frame(Y = rnorm(20, -6, 1),
                  X = sample(c("yes", "no", NA), 20, replace = TRUE))

dfplot<- df   %>% mutate(X=fct_relevel(X, "yes"))%>%
  ggplot(.,
         aes(x=X, y=Y, fill=X))+
  geom_boxplot(size=1, width = 0.2, show.legend = F, outlier.shape = NA,
               position=position_nudge(x=0.3))+
  geom_jitter(show.legend = T, shape=21, width=0.2, size=2)+
  geom_crossbar(data=df %>% group_by(X) %>% summarise(mean=mean(Y), .groups="keep"),
                aes(x=X, ymin=mean, ymax=mean, y=mean), width = 0.2, show.legend = F)+

  labs(x="",
       y="%")

dfplot


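However, when I try to plot only the "yes" and "no" groups, removing the NA values with filter(X!="NA"), I cannot get them into the correct order with "yes" as the first column. The same happens if I use drop_na("X") or filter(!is.na(X)) instead of filter(X!="NA"):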

dfplot<- df %>% filter(X!="NA")  %>% mutate(X=fct_relevel(X, "yes"))%>%
  ggplot(.,
         aes(x=X, y=Y, fill=X))+
  geom_boxplot(size=1, width = 0.2, show.legend = F, outlier.shape = NA,
               position=position_nudge(x=0.3))+
  geom_jitter(show.legend = T, shape=21, width=0.2, size=2)+
  geom_crossbar(data=df %>% group_by(X) %>% summarise(mean=mean(Y), .groups="keep"),
                aes(x=X,ymin=mean, ymax=mean, y=mean), width = 0.2, show.legend = F)+

  labs(x="",
       y="%")

dfplot


Tags: r, ggplot2, forcats

Solution


I think the cause is that geom_crossbar is given its own data, built from the unfiltered df, so the NA values are never removed for that layer.
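To see why the crossbar layer matters, it may help to look at the summary table on its own. The sketch below uses a small made-up data frame (toy, with hypothetical values, not the data from the question) and compares the summary with and without the NA filter.

```r
library(dplyr)

# Small fixed data set (hypothetical values) with missing group labels
toy <- data.frame(Y = c(1, 2, 3, 4, 5, 6),
                  X = c("yes", "no", NA, "yes", "no", NA))

# Summary built the way the question passes it to geom_crossbar:
# group_by() keeps NA as a group of its own
with_na <- toy %>%
  group_by(X) %>%
  summarise(mean = mean(Y), .groups = "drop")

# Same summary with the missing labels removed first
without_na <- toy %>%
  filter(!is.na(X)) %>%
  group_by(X) %>%
  summarise(mean = mean(Y), .groups = "drop")

nrow(with_na)     # 3 rows: "no", "yes" and NA
nrow(without_na)  # 2 rows: "no" and "yes"
```

Because the unfiltered summary still carries an NA row, that layer brings NA back onto the x axis even though the main data was filtered.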

Also, try calling set.seed at the beginning of your code block so that the example is fully reproducible.
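As a quick illustration of what set.seed buys you, the following sketch draws the same random sample twice after resetting the seed. The names first and second are only for illustration.

```r
# Resetting the seed before each draw gives the same sample both times
set.seed(123456)
first <- sample(c("yes", "no", NA), 20, replace = TRUE)

set.seed(123456)
second <- sample(c("yes", "no", NA), 20, replace = TRUE)

identical(first, second)  # TRUE
```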

The code below should produce a plot with only "yes" and "no", in the correct level order.

library(ggplot2)
library(forcats)
library(dplyr)
library(ggpubr)

set.seed(123456)

df <- data.frame(Y = rnorm(20, -6, 1),
               X = sample(c("yes", "no", NA), 20, replace = TRUE))


dfplot <- df %>% filter(!is.na(X))  %>% mutate(X=fct_relevel(X, 'yes')) %>%
    ggplot(.,
           aes(x=X, y=Y, fill=X))+
    geom_boxplot(size=1, width = 0.2, show.legend = F, outlier.shape = NA,
                 position=position_nudge(x=0.3))+
    geom_jitter(show.legend = T, shape=21, width=0.2, size=2)+
    geom_crossbar(data=df %>% filter(!is.na(X)) %>% group_by(X) %>% summarise(mean=mean(Y), .groups="keep"),
                  aes(x=X,ymin=mean, ymax=mean, y=mean), width = 0.2, show.legend = F)+
    
    labs(x="",
         y="%")

dfplot
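
One possible variation, offered only as a sketch: filter and relevel once, store the result, and build both the main layers and the crossbar summary from that same object, so the two cannot drift apart. The names df_clean, df_means and dfplot2 are made up for this example, and the boxplot line size and the unused ggpubr import are left out for brevity.

```r
library(ggplot2)
library(forcats)
library(dplyr)

set.seed(123456)

df <- data.frame(Y = rnorm(20, -6, 1),
                 X = sample(c("yes", "no", NA), 20, replace = TRUE))

# Clean once: drop rows with missing X, then move "yes" to the front
df_clean <- df %>%
  filter(!is.na(X)) %>%
  mutate(X = fct_relevel(X, "yes"))

# Group means computed from the same cleaned data
df_means <- df_clean %>%
  group_by(X) %>%
  summarise(mean = mean(Y), .groups = "drop")

dfplot2 <- ggplot(df_clean, aes(x = X, y = Y, fill = X)) +
  geom_boxplot(width = 0.2, show.legend = FALSE, outlier.shape = NA,
               position = position_nudge(x = 0.3)) +
  geom_jitter(shape = 21, width = 0.2, size = 2) +
  geom_crossbar(data = df_means,
                aes(x = X, ymin = mean, ymax = mean, y = mean),
                width = 0.2, show.legend = FALSE) +
  labs(x = "", y = "%")

levels(df_clean$X)  # "yes" should come first
dfplot2
```

Since df_means is derived from df_clean, its X column is a factor with the same level order, and no NA group can reach the crossbar layer.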

