首页 > 解决方案 > 在 ggduo 散点图矩阵中显示相关指数(带着色)

问题描述

基于这篇文章,现在我有

library (GGally)

# from help
PointsWithCor <- function(data, mapping, ..., method = "pearson") {
  x <- eval(mapping$x, data)
  y <- eval(mapping$y, data)
  cor <- cor(x, y, method = method)
  ggally_points(data, mapping, ...) +
    ggplot2::geom_label(
      data = data.frame(
        x = min(x, na.rm = TRUE),
        y = max(y, na.rm = TRUE),
        lab = round(cor, digits = 3)
      ),
      mapping = ggplot2::aes(x = x, y = y, label = lab),
      hjust = 0, vjust = 1,
      size = 5, fontface = "bold",
      inherit.aes = FALSE # do not inherit anything from the ...
    )
}

# data frame
df = data.frame(runif(100),
                rnorm(100),
                rgamma(100,1,2),
                rt(100,1),
                rf(100,1,2),
                as.factor(round(runif(100,0,1))))
colnames(df) = c("a","b","c","d","e","f")

# points + cor, but only one cor index
ggduo(df,columnsX = 1:2, columnsY = 3:5,
      mapping = aes(colour = f),
      types = list(continuous = PointsWithCor))

但它会产生一个散点图矩阵,在所有 x 和所有 y 中都具有相关性。我想以与散点图中的点着色相同的方式显示相关性。

我认为它需要修改函数以colour在映射中使用属性,但不知道该怎么做。有人可以给我一个建议吗?

在此处输入图像描述

编辑:
要在@aosmith 的答案中对齐图像中的相关标签,

# from help but modified
PointsWithCor <- function(data, mapping, ..., method = "pearson") {
  df <- data.frame(x = eval(mapping$x, data), y = eval(mapping$y, data), c = eval(mapping$colour, data))

  xPos = min(df$x)
  yPos = max(df$y)

  sumdf <- df %>%
    group_by(c) %>%
    summarise(
      lab = round(cor(x, y),3),
      x = xPos,
      y = yPos*min(as.numeric(c))/max(as.numeric(df$c))
    )

  ggally_points(data, mapping, ...) +
    ggplot2::geom_label(
      data = sumdf,
      mapping = ggplot2::aes(x = x, y = y, label = lab, color = c),
      hjust = 0, vjust = 1,
      size = 5, fontface = "bold",
      inherit.aes = FALSE # do not inherit anything from the ...
    )
}

标签: rggplot2

解决方案


这是一种方法。我发现关键是按组估计标签值和轴位置。我使用dplyr的辅助函数进行分组和汇总。

否则,这与您所做的类似,使用mapping情节中的。我将映射 ( x, y, colour) 存储在 data.frame 中,以便进行汇总。

您可能需要处理轴位置放置。您会看到 min x 和 max y 并不真正适用于所有这些。您可能决定以不同的方式计算它们。

这是我制作的功能:

library(GGally)
library(dplyr)

points_with_cor_color = function(data, mapping, ..., method = "pearson") {
     dat = data.frame(x = data[, as.character(mapping$x)],
                      y = data[, as.character(mapping$y)],
                      color = data[, as.character(mapping$colour)])

     sumdat = dat %>%
          group_by(color) %>%
          summarise(lab = round(cor(x, y, method = method), 3),
                    x = min(x, na.rm = TRUE), 
                    y = max(y, na.rm = TRUE) )

     ggally_points(data, mapping, ...) +
          ggplot2::geom_label(
               data = sumdat,
               mapping = ggplot2::aes(x = x, y = y, label = lab, color = color),
               hjust = 0, vjust = 1,
               size = 5, fontface = "bold", # do not inherit anything from the ...
               inherit.aes = FALSE
          )
}

这是带有ggduo().

ggduo(df,columnsX = 1:2, columnsY = 3:5,
      mapping = ggplot2::aes(color = f),
     types = list(continuous = points_with_cor_color))

在此处输入图像描述


推荐阅读