Logistic regression model not appearing on plot() - seems to be a lines() problem

Problem description

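I am trying to create a plot showing a logistic regression of binary data (clinical signs) against a continuous predictor (log copy number). I can fit the model with glm() without any problem, but I am having trouble actually drawing the regression curve with the lines() function. Here is what my data looks like.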

    df.min <- structure(list(clinical.signs = structure(c(1L, 1L, 1L, 1L, 1L, 
                                                          2L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 1L, 2L, 1L, 1L, 2L, 1L, 
                                                          1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 1L, 1L, 2L, 2L, 2L, 2L, 
                                                          1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 1L, 2L, 1L, 2L, 
                                                          2L, 2L, 2L, 1L, 1L, 2L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 2L, 2L, 
                                                          1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 2L, 1L, 2L, 1L, 1L, 
                                                          1L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 2L, 1L, 2L, 2L, 2L, 1L, 2L, 
                                                          2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("0", "1"), class = "factor"), 
                             log.copy.num = c(0, 5.43372200355424, 0, 0, 0, 0, 0, 4.18965474202643, 
                                              3.42751468997953, 0, 0, 0, 0, 0, 0.824175442966349, 0, 0, 
                                              0, 0, 0, 2.97552956623647, 1.91692261218206, 1.43270073393405, 
                                              0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2.13179677201376, 0, 
                                              0, 0, 3.53805656437935, 0, 0, 0, 0, 0, 0, 0, 4.26127043353808, 
                                              2.54160199346455, 1.15057202759882, 4.88280192258637, 0, 
                                              0, 0, 0, 0, 3.62434093297637, 0, 0, 0, 0, 0, 0, 3.45946628978613, 
                                              0, 0, 0, 7.40913644392013, 0, 0, 0, 0, 0, 0, 0, 3.35689712276558, 
                                              0, 0, 0, 0, 4.25518708733893, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
                                              3.15700042115011, 0, 2.07317192866624, 0, 7.85979918056211, 
                                              3.16124671203156, 0, 2.20386912005489, 5.04985600724954, 
                                              0, 1.45395300959371, 0, 3.28091121578765, 3.83945231259331, 
                                              2.54160199346455, 2.66722820658195, 2.2512917986065, 7.53955882930103, 
                                              6.30261897574491, 6.96696713861398)), class = c("tbl_df", 
                             "tbl", "data.frame"), row.names = c(NA, -110L))

And my script:

    # logistic regression using glm
    logimodel <- glm(clinical.signs ~ log.copy.num, data = df.min, family = "binomial")
    summary(logimodel)

    # plot the logistic regression above
    xaxis <- seq(min(df.min$log.copy.num), max(df.min$log.copy.num), 0.1)
    yaxis <- predict(logimodel, list(log.copy.num = xaxis), type = "response")
    plot(xaxis, yaxis)
    plot(df.min$log.copy.num, df.min$clinical.signs)
    lines(xaxis, yaxis, col = "blue")

Thanks for any guidance on what I am sure is a silly oversight!

Tags: r, plot, line, logistic-regression, glm

Solution


You have clinical.signs stored as a factor:

    class(df.min$clinical.signs)
    [1] "factor"
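
As a small illustration of what coercing such a factor to numbers does, here is a sketch on a made-up vector `f` (a hypothetical stand-in, not the data from the question):

```r
# Hypothetical toy factor with the same levels as clinical.signs
f <- factor(c("0", "1", "1", "0"))

as.numeric(f)                 # underlying integer codes: 1 2 2 1
as.numeric(as.character(f))   # the labels themselves:    0 1 1 0
as.numeric(f) - 1             # shortcut valid for levels "0"/"1": 0 1 1 0
```

The `- 1` shortcut only works because the levels happen to be "0" and "1" in that order; going through `as.character()` does not rely on the level order.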

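Because of that, when you plot it the values are converted to 1s and 2s, while your yaxis is in the 0-1 range (since it holds the predicted probability of being "1"). The curve therefore falls outside the plotted y-range, and lines() appears to draw nothing. To put both on the same scale, do: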

    plot(df.min$log.copy.num, as.numeric(df.min$clinical.signs) - 1,
         ylab = "clinical signs", xlab = "log.copy.num")
    lines(xaxis, yaxis, col = "blue")
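
For anyone who wants to try the same pattern without the original data, the following is a minimal self-contained sketch. It assumes simulated data, and the names `x`, `y`, `y01`, `grid`, and `prob` are made up for illustration. It recodes the factor response to 0/1 before plotting, so the points and the fitted probabilities share one y-scale.

```r
# Simulated stand-in data (hypothetical; not the data from the question)
set.seed(1)
x <- runif(100, 0, 8)
y <- factor(rbinom(100, 1, plogis(-1 + 0.5 * x)))  # factor with levels "0", "1"

# Fit the logistic regression; with a factor response the first level is "failure"
fit <- glm(y ~ x, family = binomial)

# Recode the factor to 0/1 so the points sit on the 0-1 probability scale
y01 <- as.numeric(as.character(y))

# Predicted probabilities on an evenly spaced grid of x values
grid <- seq(min(x), max(x), length.out = 200)
prob <- predict(fit, newdata = data.frame(x = grid), type = "response")

plot(x, y01, xlab = "x", ylab = "P(y = 1)")
lines(grid, prob, col = "blue")
```

Using `length.out` rather than a fixed step of 0.1 is just one way to build the grid; either gives a smooth enough curve here.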

