Extracting residuals from models fit with purrr

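Question

I split my data into groups and fit a model to each group, and I want the residuals for every group. I can see each model's residuals in RStudio's viewer, but I can't work out how to extract them. The residuals for a single group can be pulled out with diamond_mods[[3]][[1]][["residuals"]], but how do I use purrr to extract them for every group (together with broom, to end up with a nice tibble)?

Here is how far I have gotten: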

library(tidyverse)
library(purrr)
library(broom)


fit_mod <- function(df) {
  lm(price ~ poly(carat, 2, raw = TRUE), data = df)
}

diamond_mods <- diamonds %>%
  group_by(cut) %>%
  nest() %>%
  mutate(
    model = map(data, fit_mod),
    tidied = map(model, tidy)
    #resid = map_dbl(model, "residuals") #this was my best try, it doesn't work
  ) %>%
  unnest(tidied) 

Tags: r, dplyr, purrr, broom

Solution

With the devel version of dplyr, we can do this with condense() after grouping by 'cut':

library(dplyr)
library(ggplot2)
library(broom)
diamonds %>%
   group_by(cut) %>%
   condense(model = fit_mod(cur_data()),
            tidied = tidy(model), 
            resid = model[["residuals"]])
# A tibble: 5 x 4
# Rowwise:  cut
#  cut       model  tidied           resid         
#  <ord>     <list> <list>           <list>        
#1 Fair      <lm>   <tibble [3 × 5]> <dbl [1,610]> 
#2 Good      <lm>   <tibble [3 × 5]> <dbl [4,906]> 
#3 Very Good <lm>   <tibble [3 × 5]> <dbl [12,082]>
#4 Premium   <lm>   <tibble [3 × 5]> <dbl [13,791]>
#5 Ideal     <lm>   <tibble [3 × 5]> <dbl [21,551]>
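
As far as I know, condense() was only available in development builds of dplyr, so it may not exist in the version you have installed. The following is a minimal sketch of a purrr-based alternative that stays close to the code in the question. It assumes the same fit_mod() helper; the names resid, augmented and diamond_resids are only illustrative. The question's map_dbl() attempt fails because map_dbl() expects a single number per model, whereas each model carries a whole vector of residuals, so map() is used to build a list column instead.

```r
library(dplyr)
library(tidyr)
library(purrr)
library(broom)
library(ggplot2)  # provides the diamonds data set

# Same helper as in the question
fit_mod <- function(df) {
  lm(price ~ poly(carat, 2, raw = TRUE), data = df)
}

diamond_mods <- diamonds %>%
  group_by(cut) %>%
  nest() %>%
  mutate(
    model = map(data, fit_mod),
    # One numeric vector of residuals per group, stored as a list column
    resid = map(model, residuals),
    # augment() appends per-observation columns such as .fitted and .resid
    # to the data each model was fit on
    augmented = map2(model, data, ~ augment(.x, data = .y))
  )

# One row per diamond: the group label, the original columns, and .resid
diamond_resids <- diamond_mods %>%
  select(cut, augmented) %>%
  unnest(augmented)
```

If only the residuals are needed, selecting cut and resid and calling unnest(resid) should give a two-column tibble without going through augment().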
