How to make boxplots with ggplot from a 3D array?

Question

I have a technical question.

Here are my observed data:

observed <- structure(c(4.06530084555243e-05, 4.34037362577724e-05, 5.25472735118296e-05, 
                        5.75250282219017e-05, 5.33322813829422e-05, 4.31323519093776e-05, 
                        2.93059438168564e-05, 3.2907253754896e-05, 3.93244409813805e-05, 
                        4.44607200813546e-05, 4.28121839343577e-05, 4.41339340180233e-05, 
                        2.45819615043229e-05, 2.77652788697063e-05, 3.471280169582e-05, 
                        4.0759303004447e-05, 4.1444945573338e-05, 3.91053759171617e-05
), .Dim = c(6L, 3L))

After simulation, I have this dataset:

simul <- structure(c(4.19400641566714e-05, 4.34037362577724e-05, 5.21778240776188e-05, 
                        5.72766282640455e-05, 5.33322813829422e-05, 4.4984474595369e-05, 
                        3.04758260711529e-05, 3.35466566427138e-05, 4.07527347018512e-05, 
                        4.51672959887775e-05, 4.42496416020706e-05, 4.41339340180233e-05, 
                        2.38725672336555e-05, 2.78960210968267e-05, 3.42390390339277e-05, 
                        4.0759303004447e-05, 4.1444945573338e-05, 4.16181419135288e-05, 
                        4.06530084555243e-05, 4.52163381730998e-05, 5.37744538705153e-05, 
                        5.75250282219017e-05, 5.44384786782902e-05, 4.27640158845638e-05, 
                        2.93059438168564e-05, 3.16988003284864e-05, 3.88757470111112e-05, 
                        4.16839537839391e-05, 4.1923490779897e-05, 4.43697930071784e-05, 
                        2.53312977844189e-05, 2.82780740113101e-05, 3.49483644305925e-05, 
                        4.23308636691264e-05, 4.36574393087853e-05, 3.91053759171617e-05, 
                        3.97856427517231e-05, 4.25485977213641e-05, 5.21380124071012e-05, 
                        5.62879076217168e-05, 5.18161751345512e-05, 4.22404154190924e-05, 
                        2.84842421189343e-05, 3.2907253754896e-05, 3.93244409813805e-05, 
                        4.28921326811218e-05, 4.2391125283836e-05, 4.28233487269764e-05, 
                        2.45819615043229e-05, 2.67311845213199e-05, 3.3715109777394e-05, 
                        4.00991849427121e-05, 4.07259705233212e-05, 3.62825448554739e-05, 
                        3.95854341194398e-05, 4.23930151174446e-05, 5.25472735118296e-05, 
                        5.76202168197769e-05, 5.23957149070388e-05, 4.31323519093776e-05, 
                        2.90350657890489e-05, 3.22693947104228e-05, 3.90988677457566e-05, 
                        4.44607200813546e-05, 4.28121839343577e-05, 4.28542288317551e-05, 
                        2.56149959419174e-05, 2.77652788697063e-05, 3.49302533009518e-05, 
                        4.13777396322285e-05, 4.12908495437265e-05, 3.92084109551252e-05, 
                        4.14887591359563e-05, 4.39273564362111e-05, 5.31197050290816e-05, 
                        5.77484133948985e-05, 5.36319646972061e-05, 4.62472643466539e-05, 
                        3.06756490605887e-05, 3.49917045844483e-05, 4.15936967740209e-05, 
                        4.66221720234964e-05, 4.48785430220286e-05, 4.44766996381653e-05, 
                        2.36916432633518e-05, 2.69248181080789e-05, 3.471280169582e-05, 
                        3.94762090257435e-05, 4.17765202936009e-05, 3.8021359310749e-05
), .Dim = c(6L, 3L, 5L))

This is a 3D array. Rows correspond to months, columns to study areas, and the third dimension to the simulated values.
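To make that layout concrete, here is a minimal sketch of how the three dimensions are indexed. It uses a toy array named `toy` (a hypothetical stand-in with the same 6 x 3 x 5 shape, filled with placeholder integers), not the data above.

```r
# Toy stand-in with the same shape as `simul`:
# 6 months x 3 study areas x 5 simulations (placeholder values only).
toy <- array(seq_len(6 * 3 * 5), dim = c(6, 3, 5))

dim(toy)        # 6 3 5
toy[2, 3, ]     # the 5 simulated values for month 2 in study area 3
toy[, 1, 4]     # all 6 months for study area 1 in simulation 4

# Each box in the requested plot summarises one vector like toy[2, 3, ].
# apply() over dimensions 1 and 2 gives one summary per month/area pair.
med <- apply(toy, c(1, 2), median)
dim(med)        # 6 3
```

Each of the 18 month/area cells therefore contributes 5 values to one box.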

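My question: is it possible to use ggplot to draw a multi-panel plot (a grid) with one panel per study area, showing boxplots of the simulated values (the third dimension) with the months on the x-axis (so 6 boxplots per panel)? I would also like to draw a line through the boxplots in each panel showing the observed values. Thanks!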

Tags: r, ggplot2

Answer


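I hope I understood correctly: for each study, draw one boxplot per month that summarises the values obtained across all 5 simulations.

First, I give the array dimension names: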

attributes(simul)$dimnames <- list(
  month  = month.abb[1:6], 
  study  = letters[1:3], 
  simval = 1:5
  )

After that, I convert the named array to a tbl_cube and then to a tibble, so that I can plot the data with the usual tidyverse routines:

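library(tidyverse)
library(magrittr)   # provides the tee pipe %T>%
# Note: as.tbl_cube() moved from dplyr to the cubelyr package in dplyr 1.0.0;
# with a current dplyr, load cubelyr as well.
# library(cubelyr)

as.tbl_cube(simul)                                       %>%
  as_tibble()                                            %>%
  rename('value' = simul)                                %>%
  mutate(
    study  = factor(paste('Study', study)),
    month  = factor(month, levels = month.abb[1:6])
    )                                                   %T>%
  print                                                  %>%   # print the tibble, then keep piping it
  ggplot(aes(x = month, y = value))                        +
  geom_boxplot(outlier.colour = 'red')                     +
  facet_wrap(~ study, nrow = 1, scales = 'free_y')         +
  ggthemes::theme_few()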

# # A tibble: 90 x 4
#    month study   simval     value
#    <fct> <fct>    <int>     <dbl>
#  1 Jan   Study a      1 0.0000419
#  2 Feb   Study a      1 0.0000434
#  3 Mar   Study a      1 0.0000522
#  4 Apr   Study a      1 0.0000573
#  5 May   Study a      1 0.0000533
#  6 Jun   Study a      1 0.0000450
#  7 Jan   Study b      1 0.0000305
#  8 Feb   Study b      1 0.0000335
#  9 Mar   Study b      1 0.0000408
# 10 Apr   Study b      1 0.0000452
# # ... with 80 more rows
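If `as.tbl_cube()` is not available, the same long format can probably be reached with base R alone. The sketch below is an alternative to the conversion above, not the answer's original method. It uses a synthetic array `toy` as a stand-in for the named `simul`, so the values are made up.

```r
# Synthetic stand-in for the named `simul` array (values are random, not real).
set.seed(1)
toy <- array(
  rnorm(6 * 3 * 5, mean = 4e-05, sd = 2e-06),
  dim = c(6, 3, 5),
  dimnames = list(month = month.abb[1:6], study = letters[1:3], simval = 1:5)
)

# as.data.frame() on a table returns one row per array cell, with one factor
# column per named dimension and the cell value in the `value` column.
long <- as.data.frame(as.table(toy), responseName = "value")

str(long)
levels(long$month)   # factor levels follow the dimnames order, Jan to Jun
```

Because the month levels keep the order of the dimnames, the x-axis stays chronological without an extra `factor()` call.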

[Figure: resulting boxplot for question 58936328]
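The question also asked for the observed values drawn as a line through the boxplots in each panel, which the code above does not cover. One possible way is to reshape `observed` to long format and add it as a second layer. This is a sketch under assumptions: `sim_toy` and `obs_toy` are synthetic stand-ins for `simul` and `observed`, and the dimension names are the ones chosen in this answer.

```r
library(ggplot2)

# Synthetic stand-ins for `simul` (6 x 3 x 5) and `observed` (6 x 3);
# replace them with the real arrays.
set.seed(1)
dn <- list(month = month.abb[1:6], study = letters[1:3])
sim_toy <- array(rnorm(90, mean = 4e-05, sd = 2e-06), dim = c(6, 3, 5),
                 dimnames = c(dn, list(simval = 1:5)))
obs_toy <- array(rnorm(18, mean = 4e-05, sd = 2e-06), dim = c(6, 3),
                 dimnames = dn)

# Long format: one row per cell, dimension names become factor columns.
sim_long <- as.data.frame(as.table(sim_toy), responseName = "value")
obs_long <- as.data.frame(as.table(obs_toy), responseName = "value")

p <- ggplot(sim_long, aes(x = month, y = value)) +
  geom_boxplot() +
  # group = 1 joins the six months into a single line within each panel
  geom_line(data = obs_long, aes(group = 1), colour = "blue") +
  geom_point(data = obs_long, colour = "blue") +
  facet_wrap(~ study, nrow = 1, scales = "free_y")

print(p)
```

Both data frames share the `month` and `study` columns, so `facet_wrap()` places each observed line in the panel of its own study.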

