Many models, grouped modelr::add_predictions

Question

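I want to fit a model using the data labeled "train", and then predict new values using the data labeled "test". I want to do this in a "many models" scenario.

Below is my current setup. My problem is that I am training on, and adding predictions to, all of the data. I don't know how to separate the two using modelr.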

library(modelr)
library(tidyverse)
library(gapminder)

# nest data by continent and label test/train data
nested_gap <- gapminder %>% 
  mutate(test_train = ifelse(year < 1992, "train", "test")) %>% 
  group_by(continent) %>% 
  nest()

# make a linear model function
cont_model <- function(df) {
  lm(lifeExp ~ year, data = df)
}

# fit a model and add predictions to all data
fitted_gap <- nested_gap %>% 
  mutate(model = map(data, cont_model)) %>% 
  mutate(pred  = map2(data, model, add_predictions))

Tags: r, tidyverse

Solution


Here is the solution provided by @shuckle:

library(modelr)
library(tidyverse)
library(gapminder)

# nest data by continent and label test/train data
nested_gap <- gapminder %>% 
  mutate(test_train = ifelse(year < 1992, "train", "test")) %>% 
  group_by(continent) %>% 
  nest()

# make a linear model function that only trains on the training set
cont_model <- function(df) {
  lm(lifeExp ~ year, data = df %>% filter(test_train == "train"))
}

# fit each model on its training rows only, then add predictions to all rows
fitted_gap <- nested_gap %>% 
  mutate(model = map(data, cont_model)) %>% 
  mutate(pred  = map2(data, model, add_predictions))

# unnest predictions and filter only the test rows
fitted_gap %>% 
  unnest(pred) %>% 
  filter(test_train == "test")
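
As a variation on the approach above, the train and test rows can be kept in separate list columns, so that each model is fit on the training rows and predictions are added to the test rows only, with no filtering step afterwards. This is a minimal sketch, not part of the original answer: the names split_gap, pred_gap, train, and test are illustrative choices, and it assumes a tidyr version (1.0 or later) in which unnest() accepts a list column of data frames.

```r
library(modelr)
library(tidyverse)
library(gapminder)

# label rows, nest by continent, then build one list column per split
# (split_gap, train and test are illustrative names)
split_gap <- gapminder %>%
  mutate(test_train = ifelse(year < 1992, "train", "test")) %>%
  group_by(continent) %>%
  nest() %>%
  ungroup() %>%
  mutate(
    train = map(data, ~ filter(.x, test_train == "train")),
    test  = map(data, ~ filter(.x, test_train == "test"))
  )

# fit on the training rows only, predict on the test rows only
pred_gap <- split_gap %>%
  mutate(
    model = map(train, ~ lm(lifeExp ~ year, data = .x)),
    pred  = map2(test, model, add_predictions)
  ) %>%
  select(continent, pred) %>%
  unnest(pred)
```

Here pred_gap should contain one row per test observation, with the prediction in the pred column. Compared with filtering after unnest(), this avoids computing predictions for the training rows, at the cost of carrying two extra list columns.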
