ggplot2: add p-value, Rsq and slope for multiple columns

Question

Suppose I have this data frame, and I plot Y against each of the other columns in a loop:

library(ggplot2)
Y <- rnorm(100)
df <- data.frame(A = rnorm(100), B = runif(100), C = rlnorm(100),
                 Y = Y)
colNames <- names(df)[1:3]
for(i in colNames){
  plt <- ggplot(df, aes_string(x = i, y = "Y")) +
    geom_point(color="#B20000", size=4, alpha=0.5) +
    geom_hline(yintercept=0, size=0.06, color="black") + 
    geom_smooth(method=lm, alpha=0.25, color="black", fill="black")
  print(plt)
  Sys.sleep(2)
}

I want to fit an lm model for each column and show the adjusted Rsq, intercept, slope and p-value on each plot. I found the example below:

data(iris)
ggplotRegression <- function (fit) {

require(ggplot2)

ggplot(fit$model, aes_string(x = names(fit$model)[2], y = names(fit$model)[1])) + 
  geom_point() +
  stat_smooth(method = "lm", col = "red") +
  labs(title = paste("Adj R2 = ",signif(summary(fit)$adj.r.squared, 5),
                     "Intercept =",signif(fit$coef[[1]],5 ),
                     " Slope =",signif(fit$coef[[2]], 5),
                     " P =",signif(summary(fit)$coef[2,4], 5)))
}

fit1 <- lm(Sepal.Length ~ Petal.Width, data = iris)
ggplotRegression(fit1)

But it only works for a single column. (I took the examples from this question and from this one here.)

Thanks!

Tags: r, ggplot2, lm, p-value

Answer


Based on the comment above, you can move the model fit inside a function and then loop over the column names with lapply:

library(ggplot2)

Y <- rnorm(100)
df <- data.frame(A = rnorm(100), B = runif(100), C = rlnorm(100),
                 Y = Y)
colNames <- names(df)[1:3]


plot_ls <- lapply(colNames, function(x){


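  # Build the formula from the column name so the model frame keeps that name
  # and the x axis is labelled with it automatically
  fit <- lm(reformulate(x, response = "Y"), data = df)
  ggplot(fit$model, aes_string(x = names(fit$model)[2], y = names(fit$model)[1])) + 
    geom_point() +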
    stat_smooth(method = "lm", col = "red") +
    ggtitle(paste("Adj R2 = ",signif(summary(fit)$adj.r.squared, 5),
                       "Intercept =",signif(fit$coef[[1]],5 ),
                       " Slope =",signif(fit$coef[[2]], 5),
                       " P =",signif(summary(fit)$coef[2,4], 5))
            )
})

# Pass the whole list so the call does not depend on the number of columns
gridExtra::grid.arrange(grobs = plot_ls)
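
The numbers pasted into each plot title can also be collected into a single table, which makes them easier to check against the plots. The following is only a minimal sketch in base R, assuming the same layout of df as above; fit_stats is a hypothetical helper name and the fixed seed is an assumption added for reproducibility, neither is part of the original answer.

```r
# Sketch: collect the statistics used in the plot titles into one table.
# `fit_stats` is a hypothetical helper, not part of the original answer.
set.seed(1)  # assumption: fixed seed so that repeated runs give the same numbers
df <- data.frame(A = rnorm(100), B = runif(100), C = rlnorm(100),
                 Y = rnorm(100))
colNames <- names(df)[1:3]

fit_stats <- function(x, data, response = "Y") {
  # Build the formula Y ~ <column name> from strings
  fit <- lm(reformulate(x, response = response), data = data)
  s <- summary(fit)
  data.frame(
    column    = x,
    adj_r2    = s$adj.r.squared,
    intercept = unname(coef(fit)[1]),
    slope     = unname(coef(fit)[2]),
    # Row 2 of the coefficient table is the slope; column 4 is Pr(>|t|)
    p_value   = s$coefficients[2, 4]
  )
}

# One row per predictor column
stats <- do.call(rbind, lapply(colNames, fit_stats, data = df))
print(stats)
```

The same fit_stats call could be used inside the lapply above to build the title string, so that the table and the plots are guaranteed to report the same fit.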


