Adding cumulative counts to a geom_bar plot drawn with facet_wrap

Question

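Newbie here! After a long search I still cannot find a satisfying solution. I have a heart failure dataset (https://archive.ics.uci.edu/ml/datasets/Heart+failure+clinical+records), and I want to show a series of geom_bar plots in which "Survived" and "Died" are counted per category (i.e. sex, smoking, etc.).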

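I think I did a decent job preparing the plots, and they look fine to me. The problem is that it is hard to see how the proportion of surviving and deceased patients differs across features.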

I have two ideas in mind, but both have eluded me:

Here is the code I wrote.


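    library(ggplot2)  # the package is ggplot2, not ggplot
    library(dplyr)    # provides %>% and select()
    library(tidyr)    # provides gather()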
    
    heart_faliure_data <- read.csv(file = "heart_failure_clinical_records_dataset.csv", header = FALSE, skip=1)
    
    #Prepare Column Names
    c_names <- c("Age",
                 "Anaemia",
                 "creatinine_phosphokinase",
                 "diabetes",
                 "ejection_fraction",
                 "high_blood_pressure",
                 "platelets",
                 "serum_creatinine",
                 "serum_sodium",
                 "sex",
                 "smoking",
                 "time",
                 "DEATH_EVENT")
    
    
    #Apply column names to the dataframe
    colnames(heart_faliure_data) <- c_names
    
    
    # Some columns (sex, Anaemia, diabetes, high_blood_pressure, smoking and DEATH_EVENT) are booleans
    # (see the description of the dataset) and should be converted to factors
    heart_faliure_data$sex <- factor(heart_faliure_data$sex, 
                                     levels=c(0,1), 
                                     labels=c("Female","Male"))
    heart_faliure_data$smoking <- factor(heart_faliure_data$smoking, 
                                         levels=c(0,1), 
                                         labels=c("No","Yes"))
    heart_faliure_data$DEATH_EVENT <- factor(heart_faliure_data$DEATH_EVENT, 
                                             levels=c(0,1), 
                                             labels=c("Survived","Died"))
    heart_faliure_data$high_blood_pressure <- factor(heart_faliure_data$high_blood_pressure, 
                                                     levels=c(0,1), 
                                                     labels=c("No","Yes"))
    heart_faliure_data$Anaemia <- factor(heart_faliure_data$Anaemia, 
                                         levels=c(0,1), 
                                         labels=c("No","Yes"))
    heart_faliure_data$diabetes <- factor(heart_faliure_data$diabetes, 
                                          levels=c(0,1), 
                                          labels=c("No","Yes"))
    # Convert Age to an integer value
    heart_faliure_data$Age <- as.integer(heart_faliure_data$Age)
    
    
    # Select the categorical variables to study the effect of each one on DEATH_EVENT
    categorical.heart_failure <- heart_faliure_data  %>%
      select(Anaemia,
             diabetes,
             high_blood_pressure,
             sex,
             smoking,
             DEATH_EVENT) %>%
      gather(key = "key", value = "value", -DEATH_EVENT)
    
    
    #Visualizing this effect with a grouped barplot
    categorical.heart_failure %>% 
      ggplot(aes(value)) +
      geom_bar(aes(x        = value, 
                   fill     = DEATH_EVENT), 
                   alpha    = .2, 
                   position = "dodge", 
                   color    = "black",
                   width    = .7,
                   stat = "count") +
      labs(x = "",
           y = "") +
      theme(axis.text.y  = element_blank(),
            axis.ticks.y = element_blank()) +
      facet_wrap(~ key, 
                 scales = "free", 
                 nrow = 4) +
      scale_fill_manual(values = c("#FFA500", "#0000FF"), 
                        name   = "Death Event", 
                        labels = c("Survived", "Dead"))

Here is the resulting (not too bad) plot.

The goal is to show some numeric values on the bars. Even just an indication...

I would be glad for any help you can give me!

Tags: r, ggplot2, categorical-data, geom-bar, facet-wrap

Answer


How about something like this? To make it work, I first aggregated the data:

    tmp <- categorical.heart_failure %>% 
      group_by(DEATH_EVENT, key, value) %>% 
      summarise(n = n())


    #Visualizing this effect with a grouped barplot
    tmp %>% 
      ggplot(aes(x = value, y = n)) +
      geom_bar(aes(fill = DEATH_EVENT), 
               alpha    = .2, 
               position = position_dodge(width = 1), 
               color    = "black",
               width    = .7,
               stat     = "identity") +
      geom_text(aes(x = value, y = n * 1.1, label = n, group = DEATH_EVENT), 
                position = position_dodge(width = 1), 
                vjust    = 0) + 
      labs(x = "",
           y = "") +
      theme(axis.text.y  = element_blank(),
            axis.ticks.y = element_blank()) +
      facet_wrap(~ key, 
                 scales = "free", 
                 nrow = 4) +
      scale_fill_manual(values = c("#FFA500", "#0000FF"), 
                        name   = "Death Event", 
                        labels = c("Survived", "Dead")) + 
      coord_cartesian(ylim = c(0, max(tmp$n) * 1.25))
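
Since the question also mentions that the proportions are hard to read, the same aggregation could be extended with a within-category share. The following is only a minimal sketch and not part of the original answer. It uses a small made-up data frame (`toy`) in place of `categorical.heart_failure`, and the `tmp_pct` table, the `pct` column and the label format are names introduced here for illustration.

```r
library(dplyr)
library(ggplot2)

# Hypothetical stand-in for categorical.heart_failure (made-up values,
# same three columns: DEATH_EVENT, key, value)
toy <- data.frame(
  DEATH_EVENT = factor(c("Survived", "Survived", "Died", "Survived",
                         "Died", "Died", "Survived", "Died"),
                       levels = c("Survived", "Died")),
  key   = c("sex", "sex", "sex", "sex",
            "smoking", "smoking", "smoking", "smoking"),
  value = c("Female", "Male", "Male", "Male",
            "No", "Yes", "No", "No")
)

# Count each combination, then compute the share of each outcome
# within one level of one feature (e.g. within sex == "Male")
tmp_pct <- toy %>%
  count(key, value, DEATH_EVENT, name = "n") %>%
  group_by(key, value) %>%
  mutate(pct = n / sum(n)) %>%
  ungroup()

# Dodged bars labelled with "count (percent)"
p <- ggplot(tmp_pct, aes(x = value, y = n, fill = DEATH_EVENT)) +
  geom_col(position = position_dodge(width = 0.9),
           width = 0.8, color = "black", alpha = 0.2) +
  geom_text(aes(label = sprintf("%d (%.0f%%)", n, 100 * pct),
                group = DEATH_EVENT),
            position = position_dodge(width = 0.9), vjust = -0.3) +
  facet_wrap(~ key, scales = "free") +
  # leave some headroom above the bars so the labels are not clipped
  scale_y_continuous(expand = expansion(mult = c(0, 0.15)))

print(p)
```

With the real data, `toy` would be replaced by `categorical.heart_failure`. The shares are computed per `key` and `value` pair, so the labels of the two bars in each dodged group sum to 100%.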
