r - 如何在 ggplot2 中绘制自举斜率的向量?
问题描述
我一直在使用 ggplot2 绘制引导各种统计输出(例如相关系数)的结果。最近,我引导了线性回归模型的斜率。plot()
以下是使用包中的函数的外观graphics
:
plot(main="Relationship Between Eruption Length at Wait Time at \n
Old Faithful With Bootstrapped Regression Lines",
xlab = "Eruption Length (minutes)",
ylab = "Wait Time (minutes)",
waiting ~ eruptions,
data = faithful,
col = spot_color,
pch = 19)
index <- 1:nrow(faithful)
for (i in 1:10000) {
index_boot <- sample(index, replace = TRUE) #getting a boostrap sample (of indices)
faithful_boot <- faithful[index_boot, ]
# Fitting the linear model to the bootstrapped data:
fit.boot <- lm(waiting ~ eruptions, data = faithful_boot)
abline(fit.boot, lwd = 0.1, col = rgb(0, 0.1, 0.25, alpha = 0.05)) # Add line to plot
}
fit <- lm(waiting ~ eruptions, data=faithful)
abline(fit, lwd = 2.5, col = "blue")
这行得通,但这取决于我们首先创建绘图,然后在循环中添加线条的工作流程。我宁愿用函数创建一个斜率列表,然后在 ggplot2 中绘制所有斜率。
例如,该函数可能如下所示:
set.seed(777) # included so the following output is reproducible
n_resample <- 10000 # set the number of times to resample the data
# First argument is the data; second is the number of resampled datasets
bootstrap <- function(df, n_resample) {
slope_resample <- matrix(NA, nrow = n_resample) # initialize vector
index <- 1:nrow(df) # create an index for supplied table
for (i in 1:n_resample) {
index_boot <- sample(index, replace = TRUE) # sample row numbers, with replacement
df_boot <- df[index_boot, ] # create a bootstrap sample from original data
a <- lm(waiting ~ eruptions, data=df_boot) # compute linear model
slope_resample[i] <- slope <- a$coefficients[2] # take the slope
}
return(slope_resample) # Return a vector of differences of proportion
}
bootstrapped_slopes <- bootstrap(faithful, 10000)
但是如何获取geom_line()
或geom_smooth()
获取数据bootstrapped_slopes
呢?非常感谢任何帮助。
解决方案
编辑:从 OP 更直接的改编
对于绘图,我假设您需要斜率和截距,所以这是一个修改后的bootstrap
函数:
bootstrap <- function(df, n_resample) {
# Note 2 dimensions here, for slope and intercept
slope_resample <- matrix(NA, 2, nrow = n_resample) # initialize vector
index <- 1:nrow(df) # create an index for supplied table
for (i in 1:n_resample) {
index_boot <- sample(index, replace = TRUE) # sample row numbers, with replacement
df_boot <- df[index_boot, ] # create a bootstrap sample from original data
a <- lm(waiting ~ eruptions, data=df_boot) # compute linear model
slope_resample[i, 1] <- slope <- a$coefficients[1] # take the slope
slope_resample[i, 2] <- intercept <- a$coefficients[2] # take the intercept
}
# Return a data frame with all the slopes and intercepts
return(as.data.frame(slope_resample))
}
然后运行它并从该数据框中绘制线条:
bootstrapped_slopes <- bootstrap(faithful, 10000)
library(dplyr); library(ggplot2)
ggplot(faithful, aes(eruptions, waiting)) +
geom_abline(data = bootstrapped_slopes %>%
sample_n(1000), # 10k lines look about the same as 1k, just darker and slower
aes(slope = V2, intercept = V1), #, group = id),
alpha = 0.01) +
geom_point(shape = 19, color = "red")
替代解决方案
这也可以使用modelr
和broom
简化一些引导来完成。基于 的主要帮助示例modelr::bootstrap
,我们可以执行以下操作:
library(purrr); library(modelr); library(broom); library(dplyr)
set.seed(777)
# Creates bootstrap object with 10k extracts from faithful
boot <- modelr::bootstrap(faithful, 10000)
# Applies the linear regression to each
models <- map(boot$strap, ~ lm(waiting ~ eruptions, data = .))
# Extracts the model results into a tidy format
tidied <- map_df(models, broom::tidy, .id = "id")
# We just need the slope and intercept here
tidied_wide <- tidied %>% select(id, term, estimate) %>% spread(term, estimate)
ggplot(faithful, aes(eruptions, waiting)) +
geom_abline(data = tidied_wide %>%
sample_n(1000), # 10k lines look about the same as 1k, just darker and slower
aes(slope = eruptions, intercept = `(Intercept)`, group = id), alpha = 0.05) +
geom_point(shape = 19, color = "red") # spot_color wasn't provided in OP
推荐阅读
- javascript - 在扩展程序中访问网页的 Javascript 代码内容?
- mysql - 面临有关创建 mysql 事件的问题
- angular - PrimeNG 表格对每一列的过滤和排序不起作用
- php - php+symfony 5.0.x 扩展抽象类和同时实现接口时自动加载问题
- python - 我如何调试 pindown python 脚本?
- python - Pyspark 多处理方法或代码优化
- python - django模型-删除属性后数据无效
- c# - Swagger UI AuthHeader 过滤器 - 授权过滤器不在 FilterDescriptors 中
- wordpress - Woocommerce 产品详细信息页面删除添加到购物车并放置电子邮件按钮
- html - Powershell“InternetExplorer.Application”点击下拉菜单