How do I unnormalize (back-transform) variables after using a recipe in R?

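Question

I am training a neuralnet model using caret's train function together with recipes.

Is there a function that takes the predictions from the model and rescales them to the original range, [1, 100] in my case?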

library(caret)
library(recipes)
library(neuralnet)

# Create the dataset - times table 
tt <- data.frame(multiplier = rep(1:10, times = 10), multiplicand = rep(1:10, each = 10))
tt <- cbind(tt, data.frame(product = tt$multiplier * tt$multiplicand))

# Splitting 
indexes <- createDataPartition(tt$product,
                              times = 1,
                              p = 0.7,
                              list = FALSE)
tt.train <- tt[indexes,]
tt.test <- tt[-indexes,]

# Recipe to pre-process our data
rec_reg <- recipe(product ~ ., data = tt.train) %>%
  step_center(all_predictors()) %>% step_scale(all_outcomes()) %>%
  step_center(all_outcomes()) %>% step_scale(all_predictors())

# Train
train.control <- trainControl(method = "repeatedcv",
                              number = 10,
                              repeats = 3,
                              savePredictions = TRUE)

tune.grid <- expand.grid(layer1 = 8,
                         layer2 = 0,
                         layer3 = 0)

# Setting seed for reproducibility
set.seed(12)
tt.cv <- train(rec_reg,
               data = tt.train,
               method = 'neuralnet',
               tuneGrid = tune.grid,
               trControl = train.control,
               algorithm = 'backprop',
               learningrate = 0.005,
               lifesign = 'minimal')

Tags: r, machine-learning, r-caret

Answer


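If you use step_normalize instead of step_scale and step_center, you can use the functions below to "unnormalize" based on a recipe. (If you prefer to normalize in two steps, you will need to adapt the unnormalize function.)

The following function extracts the relevant item from a step of the prepped recipe.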

#' Extract step item
#'
#' Returns extracted step item from prepped recipe.
#'
#' @param recipe Prepped recipe object.
#' @param step Step from prepped recipe.
#' @param item Item from prepped recipe.
#' @param enframe Should the step item be enframed?
#'
#' @export
extract_step_item <- function(recipe, step, item, enframe = TRUE) {
  d <- recipe$steps[[which(purrr::map_chr(recipe$steps, ~ class(.)[1]) == step)]][[item]]
  if (enframe) {
    tibble::enframe(d) %>% tidyr::spread(key = 1, value = 2)
  } else {
    d
  }
}
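To see what extract_step_item reads, here is a minimal sketch that rewrites the question's recipe with a single step_normalize and inspects the prepped step. It assumes the recipes package is installed. The names rec_norm, prepped and norm_step are illustrative and do not come from the original post, and for brevity the recipe is prepped on the full times table rather than on a training split.

```r
library(recipes)

# Times-table data, as in the question
tt <- data.frame(multiplier = rep(1:10, times = 10),
                 multiplicand = rep(1:10, each = 10))
tt$product <- tt$multiplier * tt$multiplicand

# One step_normalize() replaces the separate center/scale steps
rec_norm <- recipe(product ~ ., data = tt) %>%
  step_normalize(all_predictors(), all_outcomes())

# The means and standard deviations are only estimated once the recipe is prepped
prepped <- prep(rec_norm, training = tt)

# Locate the step by its first class, the same lookup extract_step_item() uses
is_norm <- vapply(prepped$steps,
                  function(s) class(s)[1] == "step_normalize",
                  logical(1))
norm_step <- prepped$steps[[which(is_norm)]]

# Named numeric vectors, one entry per normalized column
norm_step$means
norm_step$sds
```

Both items are named vectors keyed by column name, which is why the helper can enframe them and pull a single variable out by name.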

The following function does the unnormalizing: it multiplies by the standard deviation and adds the mean.

#' Unnormalize variable
#'
#' Unnormalizes variable using standard deviation and mean from a recipe object. See \code{?recipes}.
#'
#' @param x Numeric vector to unnormalize.
#' @param rec Recipe object.
#' @param var Variable name in the recipe object.
#'
#' @export
unnormalize <- function(x, rec, var) {
  var_sd <- extract_step_item(rec, "step_normalize", "sds") %>% dplyr::pull(var)
  var_mean <- extract_step_item(rec, "step_normalize", "means") %>% dplyr::pull(var)

  (x * var_sd) + var_mean
}

So you should be able to generate predictions and then use:

unnormalize(predictions, prepped_recipe_obj, outcome_var_name)

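where predictions is the vector of predictions generated from the trained model, prepped_recipe_obj is rec_reg in your case (once prepped, and rewritten to use step_normalize), and outcome_var_name is "product" in your case.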
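As a sanity check on the arithmetic, the round trip can be reproduced in base R without any recipe. This is only a sketch: z stands in for predictions on the normalized scale, and out_mean and out_sd stand in for the values that unnormalize pulls from the prepped recipe. All three names are hypothetical.

```r
# Outcome of the full times table: every product of 1:10 with 1:10
product <- as.numeric(outer(1:10, 1:10))

# The statistics a normalization step would store for the outcome
out_mean <- mean(product)
out_sd <- sd(product)

# Normalize: subtract the mean, then divide by the standard deviation
z <- (product - out_mean) / out_sd

# Unnormalize: multiply by the standard deviation, then add the mean
restored <- z * out_sd + out_mean

# The restored values match the originals up to floating-point error
all.equal(restored, product)
range(restored)
```

Real model predictions will not reproduce the outcome exactly, but the same two statistics map them back to the original scale.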

