Scaling the y-axis in ggplot continuously with outliers in R

Problem description

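I have a question about visualizing data with ggplot in R. Specifically, it concerns scaling the y-axis when the data contain outliers.

Let's start with a sample dataset containing observations from 31 IDs. 30 IDs are within the expected range, and one is an outlier: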

# Load libraries
library(tidyverse)
library(ggbeeswarm)
library(data.table)

# Set seed
set.seed(123)

# Create dataset
ID <- sprintf("ID-%s",seq(1:30))
baseline <- rnorm(30, mean = 50, sd = 3)

df <- data.frame(ID, baseline) %>%
  mutate(`1` = baseline - rnorm(1, mean = 5, sd = 4), 
         `2` = `1` - rnorm(1, mean = 3, sd = 5), 
         `3` = `2` - rnorm(1, mean = 1, sd = 3)) 

# Add outlier
df <- as.data.frame(rbindlist(list(df, list("ID-31", 0.01, 0.02, 0.03 ,1))))

df <- df %>% 
  pivot_longer(-ID) %>% 
  rename(time = name) %>% 
  mutate(time = as.factor(time))

# Plot
ggplot(data = df, aes(x=time, y = value)) + 
  geom_quasirandom() +
  theme_classic() + 
  scale_x_discrete(limits = c("baseline", "1", "2", "3") ) +
  labs(x = "Time", y = "Value")


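Expected output

Because the variation in the upper part of the graph is not clearly visible, I would like to scale the y-axis so that all values are shown but the focus is on a certain part of the plot (in this case, values between 20 and 50).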


Question

Is it possible to scale the y-axis in this way?

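Additional information

I am specifically not looking for a data transformation solution. Also, I am aware of the scale_y_continuous function in ggplot and its limits argument, but this omits part of the data.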
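
To illustrate why the limits argument omits data: as far as I know, continuous scales in ggplot2 use scales::censor as their default out-of-bounds handler, which replaces values outside the limits with NA before anything is drawn. A minimal sketch with hypothetical toy values (not the dataset above):

```r
library(scales)

# Hypothetical values: three in the expected range and one outlier
value <- c(45, 48, 43, 0.5)

# censor() is, to my knowledge, the default out-of-bounds handler for
# continuous ggplot2 scales; values outside the range become NA
censored <- censor(value, range = c(20, 50))
print(censored)
```

The outlier comes back as NA, which is why the point disappears from the plot once limits = c(20, 50) is set on the scale.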

Tags: r, ggplot2, scale

Solution


I don't know of a way to get a broken y-axis with ggplot, but the following achieves something similar if you can specify in advance which ID is going to be the outlier.

library(tidyverse)
library(ggbeeswarm)
library(data.table)

# Set seed
set.seed(123)

# Create dataset
ID <- sprintf("ID-%s",seq(1:30))
baseline <- rnorm(30, mean = 50, sd = 3)

df <- data.frame(ID, baseline) %>%
  mutate(`1` = baseline - rnorm(1, mean = 5, sd = 4), 
         `2` = `1` - rnorm(1, mean = 3, sd = 5), 
         `3` = `2` - rnorm(1, mean = 1, sd = 3)) 

# Add outlier
df <- as.data.frame(rbindlist(list(df, list("ID-31", 0.01, 0.02, 0.03 ,1))))

df <- df %>% 
  pivot_longer(-ID) %>% 
  rename(time = name) %>% 
  mutate(time = as.factor(time),
         is_outlier = (as.character(ID) == "ID-31"))

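# Plot: outlier and non-outlier IDs go into separate panels (rows),
# and scales = "free_y" gives each panel its own y-axis range
ggplot(data = df, aes(x = time, y = value)) + 
  geom_point() + 
  facet_grid(rows = vars(is_outlier), 
             scales = "free_y",
             switch = "y") +
  theme_classic() + 
  scale_x_discrete(limits = c("baseline", "1", "2", "3") ) +
  labs(x = "Time", y = "Value")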
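
If the outlier ID is not known in advance, the is_outlier column could be derived from the data instead of being hard-coded. The sketch below is only one possible approach and is not part of the answer above: it assumes the common 1.5 * IQR rule per time point, and the toy data frame and the intermediate column names (q1, q3, point_outlier) are hypothetical.

```r
library(dplyr)

# Hypothetical long-format data in the same shape as df above
toy <- data.frame(
  ID    = rep(c("ID-1", "ID-2", "ID-3", "ID-4", "ID-5"), each = 2),
  time  = rep(c("baseline", "1"), times = 5),
  value = c(50, 45, 51, 46, 49, 44, 52, 47, 0.01, 0.02)
)

flagged <- toy %>%
  # Flag single observations outside the 1.5 * IQR fences of their time point
  group_by(time) %>%
  mutate(q1 = quantile(value, 0.25),
         q3 = quantile(value, 0.75),
         point_outlier = value < q1 - 1.5 * (q3 - q1) |
                         value > q3 + 1.5 * (q3 - q1)) %>%
  # Mark an ID as an outlier if any of its observations was flagged
  group_by(ID) %>%
  mutate(is_outlier = any(point_outlier)) %>%
  ungroup() %>%
  select(ID, time, value, is_outlier)

# IDs that would end up in the separate facet row
outlier_ids <- unique(as.character(flagged$ID[flagged$is_outlier]))
print(outlier_ids)
```

The resulting is_outlier column can be passed to facet_grid(rows = vars(is_outlier), scales = "free_y") exactly as in the plot above. Whether the 1.5 * IQR rule is appropriate depends on the data, so treat the threshold as an assumption to adjust.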
