Beautifying and ordering some variables in a Sankey/Alluvial plot using R

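Question

I am trying to improve my data visualization skills, and I have almost got what I want. At some point, though, I got stuck and could not move forward. Note that I searched extensively on this site to try to resolve my doubts, and that helped me a lot.

Here is my dataset: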

https://app.box.com/s/pp5p5chgypn6ba33anotie7wlxvdu01v

Here is my code:

library(tidyverse)
library(ggalluvial)
library(alluvial)

A_col <- "firebrick3"
B_col <- "darkorange"
C_col <- "aquamarine2"
D_col <- "dodgerblue2"
E_col <- "darkviolet"
F_col <- "chartreuse2"
G_col <- "goldenrod1"
H_col <- "gray73"
set.seed(39)

ggplot(df,
       aes(y = Time, axis1 = Activity, axis2 = Category, axis3 = Positions)) +
  geom_alluvium(aes(fill = Positions, color = Positions), 
        width = 4/12, alpha = 0.5, knot.pos = 0.3) +
  geom_stratum(width = 4/12, color = "grey36") +
  geom_text(stat = "stratum", label.strata = TRUE) +
  scale_x_continuous(breaks = 1:3, 
       labels = c("Activity", "Category", "Positions/Movements"), expand = c(.01, .05)) +
  ylab("Time 24 hours") +
  scale_fill_manual(values  = c(A_col, B_col, C_col, D_col, E_col, F_col, G_col, H_col)) +
  scale_color_manual(values = c(A_col, B_col, C_col, D_col, E_col, F_col, G_col, H_col)) +
  ggtitle("Physical Activity during the week and weekend") +
  theme_minimal() +
  theme(legend.position = "none", panel.grid.major = element_blank(), 
        panel.grid.minor = element_blank(), axis.text.y = element_blank(), 
        axis.text.x = element_text(size = 12, face = "bold"))

# I also have this code that I run without pre-choosing the colours.
# I like this one because the flow diagram doesn't have any border.

ggplot(df,
       aes(y = Time, axis1 = Activity, axis2 = Category, axis3 = Positions)) +
  scale_x_discrete(limits = c("Activity", "Category", "Positions/Movements"), 
       expand = c(.01, .05)) +
  ylab("Time 24 hours") +
  geom_alluvium(aes(fill = Positions), width = 4/12, alpha = 0.5, knot.pos = 0.3) +
  geom_stratum() + geom_text(stat = "stratum", label.strata = TRUE) +
  theme_minimal() +
  ggtitle("Physical Activity during the week and weekend") +
  theme(legend.position = "none", panel.grid.major = element_blank(), 
        panel.grid.minor = element_blank(), axis.text.y = element_blank(), 
        axis.text.x = element_text(size = 12, face = "bold"))

Here is the visualization: [image]

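There are three things I really could not do:

  1. Order the Category column so that it reads clearly from week to weekend, e.g. Working, Non Working, Sleep Week, Leisure, and Sleep Weekend.

  2. Order the Positions/Movements column, e.g. Sitting, Lying, Standing, Moving, Stairs, Walk Slow, Walk Fast, and Running. I would also like to fill the boxes of this column with the same colors as the flows. Another issue is that some names do not have enough space; I do not know whether the space can be adjusted to fit them, or whether they can be placed outside with an arrow pointing to the box they belong to. I almost forgot: is there a way to assign a color manually to each value, for example black for Walk Slow? Also, if possible, I would like to remove the lines along the edges of the flows.

  3. Is there any way to stack the names Positions and Movements?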

Is there any way to improve this visualization and make it look good?

Thanks in advance, Luis

Tags: r, ggplot2, charts, graph-visualization, sankey-diagram

Answer

Here is a solution that addresses some of your questions.

library(tidyverse)
library(ggalluvial)

# A_col ... H_col are the colour variables defined in the question's code.
df <- read_csv('Desktop/plot_alluvial_category_position_plus_moviments.csv')

# Convert both columns to factors; the strata are drawn in the order of the levels.
positions <- c("Sitting", "Lying", "Standing", "Moving", "Stairs", "Walk Slow",
               "Walk Fast", "Running")
df$Positions <- factor(df$Positions, levels = positions)
category <- c("Working", "Non Working", "Sleep Week", "Leisure", 
              "Sleep Weekend")
df$Category <- factor(df$Category, levels = category)

ggplot(df,
       aes(y = Time, axis1 = Activity, axis2 = Category, axis3 = Positions)) +
  # Only fill is mapped here (no color), so the flows have no border lines.
  geom_alluvium(aes(fill = Positions), 
                width = 4/12, alpha = 0.5, knot.pos = 0.3) +
  geom_stratum(width = 4/12, color = "grey36") +
  # min.height suppresses the labels of strata too small to hold them.
  # Newer ggalluvial releases deprecate label.strata in favour of
  # aes(label = after_stat(stratum)).
  geom_text(stat = "stratum", label.strata = TRUE, min.height = 100) +
  # The "\n" in the third label stacks "Positions" on top of "Movements".
  scale_x_continuous(breaks = 1:3, 
                     labels = c("Activity", "Category", "Positions\nMovements"),
                     expand = c(.01, .05)) +
  ylab("Time 24 hours") +
  scale_fill_manual(values  = c(A_col, B_col, C_col, D_col, E_col, F_col, G_col, H_col)) +
  scale_color_manual(values = c(A_col, B_col, C_col, D_col, E_col, F_col, G_col, H_col)) +
  ggtitle("Physical activity during the week and weekend") +
  theme_minimal() +
  theme(legend.position = "none", panel.grid.major = element_blank(), 
        panel.grid.minor = element_blank(), axis.text.y = element_blank(), 
        axis.text.x = element_text(size = 12, face = "bold"))
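
  1. To order your strata, convert the Category and Positions columns to factors and set the levels in the order you want.
  2. To remove the border lines of the flows, it is enough to remove color = Positions from the aes() call of geom_alluvium.
  3. You can stack the names Positions and Movements by adding a line break ("\n") to the axis label.
  4. You can assign colors to the strata, but only if the categories are always the same (see some of the examples in the ggalluvial documentation).
  5. To avoid overlapping labels on small strata, you can use the min.height argument of geom_text (with stat = "stratum") as shown in the code; it was introduced in ggalluvial version 0.9.2.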
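
To see why the factor conversion controls the order, here is a minimal base R sketch. The toy data frame and its Time values are made up for illustration; only the column name and the category labels come from the question.

```r
# Hypothetical toy data; only the column name and labels follow the question.
toy <- data.frame(
  Category = c("Leisure", "Working", "Sleep Week", "Non Working", "Sleep Weekend"),
  Time     = c(3, 8, 7, 4, 9),
  stringsAsFactors = FALSE
)

# Without explicit levels, factor() sorts the distinct values itself,
# which is rarely the order wanted in the plot.
default_order <- levels(factor(toy$Category))

# With explicit levels, the factor keeps exactly the order given.
category <- c("Working", "Non Working", "Sleep Week", "Leisure", "Sleep Weekend")
toy$Category <- factor(toy$Category, levels = category)

print(default_order)
print(levels(toy$Category))
```

Any value that is missing from levels (for example because of a spelling difference) becomes NA, so it is worth checking anyNA(toy$Category) after the conversion.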

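On the question of assigning a colour by hand (for example black for Walk Slow), one option is to pass a named vector to scale_fill_manual(), so that each colour is matched to a level by name instead of by position. This is a sketch under assumptions and not part of the code above: the toy data frame and the position_cols vector are hypothetical, and aes(label = after_stat(stratum)) assumes ggplot2 3.3.0 or later together with a recent ggalluvial release.

```r
library(ggplot2)
library(ggalluvial)

# Hypothetical toy data with the same column names as in the question.
toy <- data.frame(
  Activity  = c("Week", "Week", "Weekend", "Weekend"),
  Category  = c("Working", "Sleep Week", "Leisure", "Sleep Weekend"),
  Positions = factor(c("Sitting", "Lying", "Walk Slow", "Lying"),
                     levels = c("Sitting", "Lying", "Walk Slow")),
  Time      = c(8, 7, 3, 9)
)

# Named vector: each colour is tied to a level by name, not by position.
position_cols <- c("Sitting"   = "firebrick3",
                   "Lying"     = "darkorange",
                   "Walk Slow" = "black")

p <- ggplot(toy,
            aes(y = Time, axis1 = Activity, axis2 = Category, axis3 = Positions)) +
  # Only fill is mapped, so the flows are drawn without border lines.
  geom_alluvium(aes(fill = Positions), width = 4/12, alpha = 0.5) +
  geom_stratum(width = 4/12, color = "grey36") +
  geom_text(stat = "stratum", aes(label = after_stat(stratum))) +
  scale_fill_manual(values = position_cols) +
  theme_minimal() +
  theme(legend.position = "none")

print(p)
```

With names on the vector, reordering the factor levels later does not change which colour each position gets.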