How does setting the preProcess argument in caret's train function work?

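Problem description

I am trying to predict the times table by training a neural network. However, I can't work out how the preProcess argument of caret's train function works.

The documentation says:

The preProcess class can be used for many operations on predictors, including centering and scaling.

When we set preProcess as below,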

tt.cv <- train(product ~ .,
               data = tt.train,
               method = 'neuralnet',
               tuneGrid = tune.grid,
               trControl = train.control,
               linear.output = TRUE,
               algorithm = 'backprop',
               preProcess = 'range',
               learningrate = 0.01)
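  1. In this case, does it mean that the train function pre-processes (normalizes) the training data passed in, tt.train?
  2. After training, when we call predict, do we pass normalized inputs to the predict function, or are the inputs normalized inside the function because we set the preProcess argument?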
# Do we do
predict(tt.cv, tt.test)
# or
predict(tt.cv, tt.normalized.test)
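  3. From the quote above, it seems that when we use preProcess, the outcome is not normalized during training. How do we normalize the outcome? Or do we simply normalize the training data beforehand, as below, and then pass it to the train function?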
preProc <- preProcess(tt, method = 'range')
tt.preProcessed <- predict(preProc, tt)
tt.preProcessed.train <- tt.preProcessed[indexes,]
tt.preProcessed.test <- tt.preProcessed[-indexes,]

The full code:

library(caret)
library(neuralnet)

# Create the dataset
tt = data.frame(multiplier = rep(1:10, times = 10), multiplicand = rep(1:10, each = 10))
tt = cbind(tt, data.frame(product = tt$multiplier * tt$multiplicand))

# Splitting 
indexes = createDataPartition(tt$product,
                              times = 1,
                              p = 0.7,
                              list = FALSE)
tt.train = tt[indexes,]
tt.test = tt[-indexes,]

# Pre-process

preProc <- preProcess(tt, method = c('center', 'scale'))
tt.preProcessed <- predict(preProc, tt)
tt.preProcessed.train <- tt.preProcessed[indexes,]
tt.preProcessed.test <- tt.preProcessed[-indexes,]

# Train

train.control <- trainControl(method = "repeatedcv",
                              number = 10,
                              repeats = 3,
                              savePredictions = TRUE)

tune.grid <- expand.grid(layer1 = 8,
                         layer2 = 0,
                         layer3 = 0)

tt.cv <- train(product ~ .,
               data = tt.train,
               method = 'neuralnet',
               tuneGrid = tune.grid,
               trControl = train.control,
               algorithm = 'backprop',
               learningrate = 0.01,
               stepmax = 100000,
               preProcess = c('center', 'scale'),
               lifesign = 'minimal',
               threshold = 0.01)

Tags: r, machine-learning, r-caret

Solution
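
A minimal sketch touching the three questions, based on caret's documented behaviour: the preProcess argument of train is applied to the predictors only, its parameters are estimated from the training data, and predict on the fitted object re-applies the same transformation to new data. Under that reading, raw (unnormalized) predictors go to predict, and the outcome has to be scaled and back-transformed by hand if the model needs it. The sketch makes several assumptions that are not in the question: it swaps in method = 'knn' so that it runs quickly (the question uses 'neuralnet'), it uses 5-fold cross-validation with a fixed k, and the names y.min, y.max, tt.train.scaled, fit, pred.scaled and pred are illustrative.

```r
library(caret)

set.seed(42)

# Same times-table data as in the question
tt <- data.frame(multiplier = rep(1:10, times = 10),
                 multiplicand = rep(1:10, each = 10))
tt$product <- tt$multiplier * tt$multiplicand

indexes <- createDataPartition(tt$product, times = 1, p = 0.7, list = FALSE)
tt.train <- tt[indexes, ]
tt.test <- tt[-indexes, ]

# Question 3: train's preProcess leaves the outcome untouched, so scale it
# by hand, using statistics from the training set only (illustrative names)
y.min <- min(tt.train$product)
y.max <- max(tt.train$product)
tt.train.scaled <- tt.train
tt.train.scaled$product <- (tt.train$product - y.min) / (y.max - y.min)

# Question 1: preProcess = 'range' rescales the predictors inside train
# (assumption: 'knn' stands in for the question's 'neuralnet')
fit <- train(product ~ .,
             data = tt.train.scaled,
             method = 'knn',
             preProcess = 'range',
             tuneGrid = data.frame(k = 3),
             trControl = trainControl(method = 'cv', number = 5))

# The pre-processing estimated on the training predictors is stored here
print(fit$preProcess)

# Question 2: pass the raw test predictors; predict re-applies the stored
# pre-processing before calling the underlying model
pred.scaled <- predict(fit, tt.test)

# Undo the manual outcome scaling to get back to the original units
pred <- pred.scaled * (y.max - y.min) + y.min
head(data.frame(actual = tt.test$product, predicted = pred))
```

Scaling the outcome from the training rows only, rather than calling preProcess on the whole of tt before splitting, keeps information about the test rows out of the fitted model.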

