Centering stat_summary means with position_dodge for a varying number of factor levels

Problem description

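I have a question about dodging stat_summary correctly with position_dodge. I'll use the diamonds data to illustrate. The goal is to visualize how a variable used for prediction (here, price) is distributed across the classes I want to predict (carat > 1 and carat < 1).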

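To do this, I split the continuous carat values into two classes, carat > 1 and carat < 1, and compute the mean price for each class. In a further step, carat is binned again inside the fill aesthetic of the ggplot2 call. The first fill bin, 0-1, coincides with the class carat < 1. In my real data the first class is "absent" and the second is "present". The remaining fill bins therefore all belong to the class carat > 1 and show its distribution in more detail.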
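As a rough illustration of the "mean price per class" step described above, the same numbers can be computed outside the plot. This is only a sketch: it assumes the ggplot2 package is installed (for the diamonds data), uses the full data set rather than the 1500-row sample, and the names d and class_means are made up for the example.

```r
# Sketch: mean price per class, computed outside of ggplot2.
# Assumes the ggplot2 package is installed (it ships the diamonds data).
d <- as.data.frame(ggplot2::diamonds)

# Two classes, using the same breaks as the cut() call in the question.
d$classes <- cut(d$carat, breaks = c(-Inf, 1, Inf),
                 labels = c("carat_below_1", "carat_above_1"))

# Mean price for each combination of cut and class. These are the values
# the summary layer is meant to draw as the large points.
class_means <- aggregate(price ~ cut + classes, data = d, FUN = mean)
print(class_means)
```

Having the means in a separate data frame also makes it easy to check that the large points in the plot show the expected values.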

(Figure: diamond plot)

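My question: is it possible to center the geom_point mean of each class regardless of the number of fill bins? In this case, the blue dot should sit at the center of the column of blue circles, and the red dot at the center of however many fill bins are present (2 or 3).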

I have been playing with the width value in stat_summary(position = position_dodge(width = 0.9)), but I just can't quite get there.
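A rough sketch of the dodge arithmetic may help show why tuning a single width does not get there. It rests on a simplifying assumption, not on ggplot2's actual internals: that dodging spreads n groups evenly over a band of the given width. The helper dodge_offsets is hypothetical, and the example assumes an x position where three colour/fill combinations are present (one bin for carat < 1, two for carat > 1).

```r
# Simplified model of dodging: n groups share a band of the given width,
# each centred in its own equal-sized slot. This is an illustrative
# assumption, not a copy of ggplot2's internal code.
dodge_offsets <- function(n, width) {
  width * ((seq_len(n) - 0.5) / n - 0.5)
}

# Summary layer: 2 colour groups, width 0.9.
mean_offsets <- dodge_offsets(2, 0.9)

# Point layer at an x position with 3 colour/fill groups, dodge.width 0.7:
# slot 1 is the single "carat < 1" bin, slots 2 and 3 are "carat > 1" bins.
point_offsets <- dodge_offsets(3, 0.7)

# Where each class mean would need to sit to be centred over its points.
target <- c(below_1 = point_offsets[1], above_1 = mean(point_offsets[2:3]))

print(round(mean_offsets, 3))
print(round(target, 3))
```

Under this model the two summary points are always placed symmetrically around the x position, while the centred targets are not symmetric, so no single width value can align both classes at once.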

library(ggplot2)
data("diamonds")
set.seed(10)
diamond_subset <- diamonds[sample(nrow(diamonds), 1500),]
diamond_subset$carat[diamond_subset$carat < 1] <- 0

# plot price of diamond by class label and carat intervals

# split carat values into 2 classes for classification
diamond_subset$classes <- cut(diamond_subset[["carat"]], 
  breaks = c(-Inf, 1, Inf), labels = c("carat_below_1", "carat_above_1")) 

# set intervals for continuous carat values, all intervals > 1 belong to class "carat_above_1"
break_points <- c(0,1,2,3,4,5.1)

ggplot(diamond_subset, aes(x = cut, y = price, colour = classes)) +  
  geom_point(aes(fill = cut(carat, break_points, include.lowest = TRUE)), pch = 21, alpha = 0.4, size = 2,
    position = position_jitterdodge(dodge.width = 0.7, jitter.width = 0.5)) +
  stat_summary(position=position_dodge(width = 0.9), fill = "black", 
    # fun replaces fun.y, which is deprecated since ggplot2 3.3.0
    fun = mean, geom = "point", shape = 21, size = 4, stroke = 1.5, alpha = 1) +
  scale_fill_manual(values = c("white", "green", "yellow", "red", "black"),
    name = "Carat") +
  scale_colour_manual(
    values = c("blue", "red"),
    name = "Average price per class",
    breaks = c("carat_below_1", "carat_above_1"),
    labels = c("Carat < 1", "Carat > 1")
) 

Bonus question: would a possible solution also work for more than 2 classes? For example:

diamond_subset$classes <- cut(diamond_subset[["carat"]], 
  breaks = c(-Inf, 1, 2, Inf), labels = c("carat_below_1", "carat_above_1", "carat_above_2")) 

ggplot(diamond_subset, aes(x = cut, y = price, colour = classes)) +  
  geom_point(aes(fill = cut(carat, break_points, include.lowest = TRUE)), pch = 21, alpha = 0.4, size = 2,
    position = position_jitterdodge(dodge.width = 0.7, jitter.width = 0.5)) +
  stat_summary(position=position_dodge(width = 0.9), fill = "black", 
    # fun replaces fun.y, which is deprecated since ggplot2 3.3.0
    fun = mean, geom = "point", shape = 21, size = 4, stroke = 1.5, alpha = 1) +
  scale_fill_manual(values = c("white", "green", "yellow", "red", "black"),
    name = "Carat") +
  scale_colour_manual(
    values = c("blue", "red", "black"),
    name = "Average price per class",
    breaks = c("carat_below_1", "carat_above_1", "carat_above_2"),
    labels = c("Carat < 1", "Carat > 1", "Carat > 2")
) 

Edit:

If nobody has a solution for this, I could of course save the plot as a PDF and center the points manually?

Tags: r, ggplot2

Solution

