ggplot bar chart with percentages and without collapsing the data?

Question

I have a data frame that looks like this:

data <- structure(list(
  Sex = c("Male", "Male", "Male", "Male", "Female",
          "Male", "Female", "Female", "Female", "Female", "Male", "Female",
          "Female", "Female", "Male"),
  Nationality = c("USA", "USA", "USA",
                  "UK", "UK", "UK", "France", "France", "France", "France", "France",
                  "USA", "Canada", "Canada", "Mexico")
), row.names = c(NA, 15L), class = "data.frame")

I have plotted it like this:

library(ggplot2)
library(scales)  # provides percent for the axis labels

ggplot(data, aes(x = factor(Nationality))) +
  geom_bar(aes(y = (..count..)/sum(..count..), fill = Sex), width = 0.3) +
  scale_y_continuous(labels = percent, limits = c(0, 0.4)) +
  coord_flip()

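I would like to do two things:

(1) Reorder the bars in descending order, so that the first bar is the one with the highest count. I have tried reorder, as suggested in other Stack Overflow questions, but I could not get it to work. Is that because I am using percentages? Note that I do not want to plot summed counts, because I still want to be able to show sex in the plot (i.e., the data must not be collapsed). I believe this particular question has not been answered before.

(2) Add a label with the count inside each bar. I tried the following, but it did not work. The problem is that I do not know how to refer to the count in this context.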

geom_text(aes(label = Nationality), nudge_y = +1)

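Note. To clarify what I mean by not collapsing the data: I know I could use mutate to create a new data frame containing the total count for each nationality. But then I would lose the count for each sex (the data would be collapsed), so I could no longer show sex in the plot.

Tags: r, ggplot2

Answer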


Does this work for you?

library(ggplot2)
library(dplyr)
library(forcats)
library(scales)

data %>%
  # convert Nationality to factor with levels sorted according to 
  # each Nationality's total count, in reverse (i.e. descending) order
  mutate(Nationality = fct_rev(fct_infreq(Nationality))) %>%

  # aggregate by both Nationality & Sex, and calculate percentage
  count(Nationality, Sex) %>%
  mutate(p = n/sum(n)) %>%

  ggplot(aes(x = Nationality, y = p, label = n, fill = Sex)) +
  geom_col(width = 0.3) +
  geom_text(position = position_stack(vjust = 0.5)) +
  scale_y_continuous(labels = percent, limits = c(0, 0.4)) +
  coord_flip()
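
To see what the pipeline above hands to ggplot, the ordering and aggregation steps can be checked on their own. The following is only a sketch in base R (an assumption made here so that it runs without the tidyverse); the names totals and counts are illustrative and do not come from the answer.

```r
# Same data as in the question
data <- data.frame(
  Sex = c("Male", "Male", "Male", "Male", "Female", "Male", "Female", "Female",
          "Female", "Female", "Male", "Female", "Female", "Female", "Male"),
  Nationality = c("USA", "USA", "USA", "UK", "UK", "UK", "France", "France",
                  "France", "France", "France", "USA", "Canada", "Canada", "Mexico")
)

# Rows per nationality, sorted ascending so that the largest group becomes
# the last factor level (which coord_flip() draws at the top)
totals <- sort(table(data$Nationality))
data$Nationality <- factor(data$Nationality, levels = names(totals))

# One row per Nationality/Sex pair: count n and share p of all rows
counts <- as.data.frame(table(Nationality = data$Nationality, Sex = data$Sex),
                        responseName = "n")
counts <- counts[counts$n > 0, ]      # drop pairs that never occur
counts$p <- counts$n / sum(counts$n)

print(levels(data$Nationality))
print(counts)
```

Because p is each pair's share of all 15 rows, the stacked segments of one bar add up to that nationality's overall percentage, while the per-sex counts stay available for the labels.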

Plot
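
As a possible alternative that answers the "how do I refer to the count" part directly, the computed count can be used inside the plot itself, without building an aggregated data frame first. This is a sketch rather than part of the original answer, and it assumes ggplot2 3.3.0 or later, where after_stat() is available. The object name p is illustrative.

```r
library(ggplot2)
library(forcats)
library(scales)

# Same data as in the question
data <- data.frame(
  Sex = c("Male", "Male", "Male", "Male", "Female", "Male", "Female", "Female",
          "Female", "Female", "Male", "Female", "Female", "Female", "Male"),
  Nationality = c("USA", "USA", "USA", "UK", "UK", "UK", "France", "France",
                  "France", "France", "France", "USA", "Canada", "Canada", "Mexico")
)

# Order the levels by frequency, reversed so the largest bar is on top
# after coord_flip()
p <- ggplot(data, aes(x = fct_rev(fct_infreq(Nationality)), fill = Sex)) +
  # bar height: each Nationality/Sex count as a share of all rows
  geom_bar(aes(y = after_stat(count / sum(count))), width = 0.3) +
  # label: the raw count, computed by the same "count" stat and
  # centred within each stacked segment
  geom_text(aes(y = after_stat(count / sum(count)), label = after_stat(count)),
            stat = "count", position = position_stack(vjust = 0.5)) +
  scale_y_continuous(labels = percent, limits = c(0, 0.4)) +
  labs(x = "Nationality", y = NULL) +
  coord_flip()

print(p)
```

The pre-aggregated version in the answer is easier to inspect and debug. This variant keeps the original data frame untouched, at the cost of repeating the count expression in both layers.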

