首页 > 解决方案 > 无法在简单逻辑回归上绘制 p 值

问题描述

我正在尝试在 R 中绘制一个简单的逻辑回归。

我正在按照本教程进行逻辑回归并计算 P 值(https://mgimond.github.io/Stats-in-R/Logistic.html)。我正在尝试使用 ggplot2 和 ggpmisc 来绘制回归图。我一直在尝试使用本指南(http://cran.nexr.com/web/packages/ggpmisc/vignettes/user-guide-1.html#stat_fit_glance)来显示 p 值

require(cowplot)
require(ggplot2)
library(ggpmisc)
library(rms)

dataset=read.table('input.txt', header=TRUE)
model <- glm(variable ~ ancestry, data=dataset, family=binomial)
summary(model)
#plot logistic regression curve
plot <- ggplot(dataset, aes(x=ancestry, y=variable)) + 
  geom_point(alpha=.5, color=dataset$colorsite) +
  stat_smooth(method="glm", se=FALSE, method.args = list(family=binomial)) + stat_fit_glance(method = "glm", method.args = list(formula = formula), geom = "text", aes(label = paste("P-value = ", signif(..p.value.., digits = 4), sep = ""))) 
ggsave("output.pdf")

然而,输出结果为

> source("C:/Users/Deven/Desktop/logistic/script.R")
Saving 7 x 7 in image
`geom_smooth()` using formula 'y ~ x'
Warning message:
Computation failed in `stat_fit_glance()`:
object of type 'closure' is not subsettable

我也尝试过来自 ggpubr 的 stat_cor,但这似乎从我计算的结果中生成了不同的 p 值和 R^2 值。

基于评论的更新: + stat_poly_eq(formula = y ~ x, method="glm", aes(x = ancestry, y = variable, label = paste(..p.value.label..,sep = "~~~~")),parse = TRUE)由于

1: Computation failed in `stat_poly_eq()`:
Method 'glm' not yet implemented. 

如果我删除方法,它默认为线性回归(并给出不对应于逻辑回归的 p 值)。

第二次更新

model <- glm(variable ~ ancestry, data=dataset, family=binomial)
summary(model)
#plot logistic regression curve
plot <- ggplot(dataset, aes(x=ancestry, y=variable)) +
  geom_point(alpha=.5, color=dataset$colorsite) +
  stat_smooth(method="glm", se=FALSE, method.args = list(family=binomial)) + stat_fit_tidy(method = "glm",method.args = list(family=binomial,formula=y~x), mapping = aes(label = sprintf("Coef = %.3g\np-value = %.3g",after_stat(x_estimate),after_stat(x_p.value))))
ggsave("variable.pdf")

产生以下错误:

Saving 7 x 7 in image
`geom_smooth()` using formula 'y ~ x'
Warning message:
Computation failed in `stat_fit_tidy()`:
no applicable method for 'tidy' applied to an object of class "c('glm', 'lm')" 

另一个更新

library(ggplot2)
library(ggpmisc)

da =read.table('data.txt', header=TRUE)


model = glm(variable ~ ancestry,family=binomial,data=da)
summary(model)

ggplot(da,aes(x = ancestry,y = variable)) + geom_point() +
  stat_smooth(method="glm",se=FALSE,method.args = list(family=binomial)) +
  stat_fit_tidy(method = "glm",method.args = list(family=binomial,formula=y~x),
                mapping = aes(label = sprintf("Coef = %.3g\np-value = %.3g",
                                              after_stat(x_estimate),after_stat(x_p.value))))
ggsave("test.pdf")

理论上可行,但它给我的 p 值与我手动计算的 p 值(对应于来自的 p 值)非常不同lrm(variable ~ ancestry, dataset)......

完全不知道这里发生了什么......

标签: rggplot2

解决方案


ggpmisc帮助页面上有一个表格,指定了可以应用于每种类型的模型的内容。

在此处输入图像描述

你有一个 glm,所以glance()fromtidy不会给你一个 p 值。使用示例:

library(ggplot2)
library(ggpmisc)

da = MASS::Pima.tr
da$label = as.numeric(da$type=="Yes")

model = glm(label ~ bmi,family=binomial,data=da)
summary(model)

Coefficients:
            Estimate Std. Error z value Pr(>|z|)    
(Intercept) -4.11156    0.92806  -4.430 9.41e-06 ***
bmi          0.10482    0.02738   3.829 0.000129 ***

你可以看到一瞥不会给你一个 p 值:

broom::glance(model)
# A tibble: 1 x 8
  null.deviance df.null logLik   AIC   BIC deviance df.residual  nobs
          <dbl>   <int>  <dbl> <dbl> <dbl>    <dbl>       <int> <int>
1          256.     199  -120.  244.  251.     240.         198   200

您需要使用tidy()和评论中提到的@JonSpring,提供公式,所以是这样的:

ggplot(da,aes(x = bmi,y = label)) + geom_point() +
stat_smooth(method="glm",se=FALSE,method.args = list(family=binomial)) +
stat_fit_tidy(method = "glm",method.args = list(family=binomial,formula=y~x),
mapping = aes(label = sprintf("Coef = %.3g\np-value = %.3g",
after_stat(x_estimate),after_stat(x_p.value))))

在此处输入图像描述


推荐阅读