Why is my ggplot connecting the points instead of drawing a regression line?

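Problem description

I want to plot a regression line showing herbivory damage (%) as a function of distance from the ecotone. But as you can see, the plot connects all the data points to each other instead.

Here is the plot: [image: plot]

Here is the code: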

#plotting herbivory as a function of distance from the ecotone
# Get fitted values for our model 
model_fit <- predict(object = herb.mod, se.fit = T)
# Add these predictions to our original data frame, in a column called fit
leaf.data$fit <- model_fit$fit
# We can then work out the upper and lower bounds of our confidence intervals, adding them to separate columns
leaf.data$upper <- model_fit$fit  + 2 * model_fit$se
leaf.data$lower <- model_fit$fit  - 2 * model_fit$se

ggplot(data=leaf.data)+
  geom_point(aes(x = distance.from.ecotone, y = mean.herbivory, col=transect))+
  # add a line for model fit
  geom_line(aes(x = distance.from.ecotone, y = mean.herbivory), size=1.0)+
  # add a ribbon showing the CIs
  geom_ribbon(aes(x = distance.from.ecotone, ymin = lower, ymax = upper), alpha=0.25)+
  # add a title 
  ggtitle("Herbivorous Damage as a Function of Distance from an Ecotone")+
  theme_light()

Tags: r, ggplot2, regression

Solution


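You want the line to go through the predicted means, not through the individual points. In your geom_line() call, y is mapped to mean.herbivory, so the raw observations get joined up; map y to the fitted values (fit) instead. (geom_smooth() is another way to draw a line through the predicted means.) Using some example data: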

# Simulated example data: 8 distances, 5 observations each
set.seed(111)
leaf.data = data.frame(distance.from.ecotone=rep(seq(5,22.5,by=2.5),each=5))
leaf.data$mean.herbivory = -3*leaf.data$distance.from.ecotone + rnorm(nrow(leaf.data),0,3) + 80
leaf.data$transect = rep(c("One","Two"),each=5,times=4)

# Fit the model and store the fitted values with approximate 95% bounds (fit +/- 2 SE)
herb.mod = lm(mean.herbivory~distance.from.ecotone,data=leaf.data)
model_fit <- predict(object = herb.mod, se.fit = TRUE)
leaf.data$fit <- model_fit$fit
leaf.data$upper <- model_fit$fit  + 2 * model_fit$se.fit
leaf.data$lower <- model_fit$fit  - 2 * model_fit$se.fit

This will work:

ggplot(data=leaf.data)+
  geom_point(aes(x = distance.from.ecotone, y = mean.herbivory, col=transect))+
  geom_line(aes(x = distance.from.ecotone, y = fit), size=1.0)+
  geom_ribbon(aes(x = distance.from.ecotone, ymin = lower, ymax = upper), alpha=0.25)

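The same kind of line can also be drawn without computing the predictions by hand. The following is a minimal sketch, assuming ggplot2 is installed; it rebuilds the example data so that it runs on its own, and p_smooth is just a name introduced here for illustration. geom_smooth(method = "lm") fits the linear model internally and by default draws a 95% confidence band, so the ribbon should be close to, but not exactly the same as, the fit +/- 2 SE band used above.

```r
library(ggplot2)

# Rebuild the simulated example data (same construction as in the answer)
set.seed(111)
leaf.data <- data.frame(distance.from.ecotone = rep(seq(5, 22.5, by = 2.5), each = 5))
leaf.data$mean.herbivory <- -3 * leaf.data$distance.from.ecotone +
  rnorm(nrow(leaf.data), 0, 3) + 80
leaf.data$transect <- rep(c("One", "Two"), each = 5, times = 4)

# Colour is mapped only in geom_point(), so geom_smooth() fits a single
# line to all observations rather than one line per transect
p_smooth <- ggplot(leaf.data, aes(x = distance.from.ecotone, y = mean.herbivory)) +
  geom_point(aes(col = transect)) +
  geom_smooth(method = "lm", formula = y ~ x, colour = "black") +
  theme_light()

p_smooth
```

Mapping col = transect in the top-level ggplot() call instead would give one regression line per transect, which may or may not be what you want.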

Note that your confidence interval values are also repeated, once for every observation at the same distance:

tail(leaf.data)
   distance.from.ecotone mean.herbivory transect      fit    upper     lower
35                  20.0      13.202012      One 19.05008 20.68734 17.412828
36                  22.5      15.988981      Two 11.55215 13.57185  9.532456
37                  22.5      12.151535      Two 11.55215 13.57185  9.532456
38                  22.5      13.502768      Two 11.55215 13.57185  9.532456
39                  22.5      10.637426      Two 11.55215 13.57185  9.532456
40                  22.5       8.570465      Two 11.55215 13.57185  9.532456

It might make more sense to create a separate data.frame for the predictions, for example:

# One row per distance at which to predict
pred = data.frame(distance.from.ecotone = 5:23)
model_fit <- predict(herb.mod, pred, se.fit = TRUE)
pred$fit <- model_fit$fit
pred$upper <- model_fit$fit  + 2 * model_fit$se.fit
pred$lower <- model_fit$fit  - 2 * model_fit$se.fit

# Points come from leaf.data; line and ribbon come from pred
ggplot(data=leaf.data)+
  geom_point(aes(x = distance.from.ecotone, y = mean.herbivory, col=transect))+
  geom_line(data=pred,aes(x = distance.from.ecotone, y = fit), size=1.0)+
  geom_ribbon(data=pred,aes(x = distance.from.ecotone, ymin = lower, ymax = upper), alpha=0.25)
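
The fit +/- 2 SE band is an approximation to a 95% confidence interval. As a sketch of one alternative, predict() for lm objects can return the interval directly via interval = "confidence", which gives a matrix with columns fit, lwr and upr. The snippet below rebuilds the example data so that it runs on its own; the names ci and p_ci are introduced here for illustration only.

```r
library(ggplot2)

# Rebuild the simulated example data and the model
set.seed(111)
leaf.data <- data.frame(distance.from.ecotone = rep(seq(5, 22.5, by = 2.5), each = 5))
leaf.data$mean.herbivory <- -3 * leaf.data$distance.from.ecotone +
  rnorm(nrow(leaf.data), 0, 3) + 80
leaf.data$transect <- rep(c("One", "Two"), each = 5, times = 4)
herb.mod <- lm(mean.herbivory ~ distance.from.ecotone, data = leaf.data)

# Predict on a separate grid and ask predict() for the 95% confidence interval
pred <- data.frame(distance.from.ecotone = 5:23)
ci <- predict(herb.mod, newdata = pred, interval = "confidence", level = 0.95)
pred <- cbind(pred, ci)  # adds the columns fit, lwr, upr

p_ci <- ggplot(data = leaf.data) +
  geom_point(aes(x = distance.from.ecotone, y = mean.herbivory, col = transect)) +
  geom_line(data = pred, aes(x = distance.from.ecotone, y = fit)) +
  geom_ribbon(data = pred, aes(x = distance.from.ecotone, ymin = lwr, ymax = upr),
              alpha = 0.25)

p_ci
```

With 38 residual degrees of freedom the t multiplier is a little above 2, so this ribbon should look almost the same as the hand-computed one.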
