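r - How to fix this cross-validation error in R
Problem description
I'm running cross-validation on a training dataset in R. I already validated with a random forest, using 10-fold and 3-fold cross-validation, and now I'm using a decision tree, which throws an error when I run it. I'm taking an online course on data science with R, and I've been stuck on this problem for hours. The code is: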
#cross validation
library(caret)
library(doSNOW)
set.seed(2348)
cv.10.folds <- createMultiFolds(rf.label, k=10, times = 10)
#check stratification
table(rf.label)
342 / 549
#set up caret's trainControl object per above
ctrl.1 <- trainControl(method = "repeatedcv", number = 10, repeats = 10, index = cv.10.folds)
table(rf.label[cv.10.folds[[33]]])
#Set up doSNOW package for multi-core training. This is helpful as we're going
#to be training a lot of trees
cl <- makeCluster(6, type = "SOCK")
registerDoSNOW(cl)
#Set seed for reproducibility and train
set.seed(32384)
rf.4.cv.1 <- train(x = rf.train.4, y = rf.label, method = "rf", tunelength = 3,
ntree = 1000, trControl = ctrl.1)
#Shutdown cluster
stopCluster(cl)
#check out results
rf.4.cv.1
#rework with 3 folds
set.seed(37596)
cv.3.folds <- createMultiFolds(rf.label, k=3, times = 10)
#set up caret's trainControl object per above
ctrl.3 <- trainControl(method = "repeatedcv", number = 3, repeats = 10, index = cv.3.folds)
#Set up doSNOW package for multi-core training. This is helpful as we're going
#to be training a lot of trees
cl <- makeCluster(6, type = "SOCK")
registerDoSNOW(cl)
#Set seed for reproducibility and train
set.seed(94622)
rf.3.cv.1 <- train(x = rf.train.3, y = rf.label, method = "rf", tunelength = 3,
ntree = 1000, trControl = ctrl.3)
#Shutdown cluster
stopCluster(cl)
#check out results
rf.3.cv.1
# Using single Decision tree to better understand what's going on with the features
library(rpart)
library(rpart.plot)
#Using 3 fold cross validation repeated 10 times
#create utility function
rpart.cv <- function(seed, training, labels, ctrl) {
  cl <- makeCluster(6, type = "SOCK")
  registerDoSNOW(cl)
  set.seed(seed)
  #Leverage formula interface for training
  rpart.cv <- train(x = training, y = labels, method = "rpart", tunelength = 30,
                    trControl = ctrl)
  #Shutdown cluster
  stopCluster(cl)
  return (rpart.cv)
}
#Grab features
features <- c("Pclass", "title", "family.size")
rpart.train.1 <- data.combined[1:891, features]
#Run cross validation and check out results
rpart.1.cv.1 <- rpart.cv(94622, rpart.train.1, rf.label, ctrl.3)
rpart.1.cv.1
#Plot
prp(rpart.1.cv.1$finalModel, type = 0, extra =1, under = TRUE)
When I run it, I get this error message:
Something is wrong; all the Accuracy metric values are missing:
Accuracy Kappa
Min. : NA Min. : NA
1st Qu.: NA 1st Qu.: NA
Median : NA Median : NA
Mean :NaN Mean :NaN
3rd Qu.: NA 3rd Qu.: NA
Max. : NA Max. : NA
NA's :3 NA's :3
Error: Stopping
In addition: Warning message:
In nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
> rpart.1.cv.1
Error: object 'rpart.1.cv.1' not found
Solution
I was able to solve it this way:
# A fuller variant of the call ended with these arguments (typos fixed):
#   method = "class", parms = list(split = "gini"), data = data.combined,
#   control = rpart.control(cp = .2, minsplit = 5, minbucket = 5, maxdepth = 10))
rpart.cv <- rpart(Survived ~ Pclass + title + family.size,
                  data = data.combined, method = "class")
rpart.plot(rpart.cv, cex = .5, extra = 4)
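
One possible explanation for the original error, offered as a guess rather than a confirmed diagnosis: caret spells the argument `tuneLength`, so the lowercase `tunelength` in the question is not picked up by `train()` and is instead forwarded to `rpart()`, which rejects arguments it does not recognize. If every resample fails that way, all accuracy values come back `NA`, and the `NA's :3` in the summary would be consistent with caret falling back to its default of three tuning values. The sketch below reproduces only the rejection step, using rpart alone on the built-in `iris` data (a stand-in, not the Titanic data from the question); `res` is a placeholder name.

```r
library(rpart)

# Pass the misspelled argument straight to rpart(), the way caret would
# forward it. tryCatch() captures the error instead of stopping the script.
res <- tryCatch(
  rpart(Species ~ ., data = iris, method = "class", tunelength = 30),
  error = function(e) e
)

# Report whether rpart() refused the unknown argument, and why.
if (inherits(res, "error")) {
  print(conditionMessage(res))
} else {
  print("rpart() accepted the argument")
}
```

Random forest calls with the same misspelling may simply ignore the extra argument, which could be why the earlier runs in the question did not fail.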
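
The question's `train()` calls spell the tuning argument `tunelength`, while caret's own spelling is `tuneLength`. If that mismatch is what broke the decision tree run, the cross-validated caret workflow may work once the spelling is corrected. Here is a minimal sketch under that assumption. It uses the built-in `iris` data as a stand-in for the Titanic features and leaves out the doSNOW cluster for brevity; `demo.label`, `demo.folds`, `demo.ctrl` and `demo.fit` are placeholder names, not from the original post.

```r
library(caret)
library(rpart)

# Stand-in labels (assumption: any factor outcome works for illustration)
demo.label <- iris$Species

# 3-fold cross-validation repeated 10 times, stratified on the labels
set.seed(37596)
demo.folds <- createMultiFolds(demo.label, k = 3, times = 10)
demo.ctrl <- trainControl(method = "repeatedcv", number = 3, repeats = 10,
                          index = demo.folds)

# Note the capital L in tuneLength
set.seed(94622)
demo.fit <- train(x = iris[, 1:4], y = demo.label, method = "rpart",
                  tuneLength = 5, trControl = demo.ctrl)

# Cross-validated accuracy for each candidate cp value
demo.fit$results[, c("cp", "Accuracy", "Kappa")]
```

With the question's own objects, the equivalent change would be replacing `tunelength = 30` with `tuneLength = 30` inside `rpart.cv`.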