Error in `[<-`(`*tmp*`, f, k, value = mean(knn.pred != test.setY)): subscript out of bounds

Problem description

I get this error when I run the code below. How can I fix it?

The data set is the Caravan Insurance data set.

 Caravan <- read.csv(file="~/Desktop/Caravan.csv")
 attach(Caravan)  # makes Purchase visible by name in the code below
 dim(Caravan)
 table(Purchase)

 [1] 5822   86
 Purchase
 No  Yes 
 5474  348

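I am trying to use 5-fold cross-validation, on the training data only, to determine the best value of K. The candidate values of K are {3, 4, ..., 10}.

I am doing K-fold cross-validation with 5 folds: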

standardized.X = scale(Caravan[, -86])
var(standardized.X[,1])
var(standardized.X[,2])
test = 1:1000
train.X=standardized.X[-test ,] 
test.X=standardized.X[test ,]
train.Y=Purchase[-test]
test.Y=Purchase[test]

# Validation set approach

set.seed(2001835)
ntrain=nrow(train.X)
retrainData <- sample(ntrain, 0.7*ntrain)
train.setX <- train.X[retrainData,]
test.setX <- train.X[-retrainData,]
train.setY <- train.Y[retrainData]
test.setY <- train.Y[-retrainData]

F <- 5
set.seed(2001835)
folds <- cut(seq(1,ntrain), breaks = F, labels = FALSE)
folds <- folds[sample(ntrain)]
folds



k = 3:10
ErrCV <- matrix(0, nrow=F, ncol= k)
for (f in 1:F) {
 retrainData <- which(folds != f)
 train.setX <- train.X[retrainData,]
 test.setX <- train.X[-retrainData,]
 train.setY <- train.Y[retrainData]
 test.setY <- train.Y[-retrainData]

 for(k in 3:10){
   # Model fitting
   knn.pred <- knn(train = train.setX, test = test.setX, cl = train.setY, k = k)

   # Error
   ErrCV[f,k]=mean(knn.pred != test.setY)
 }
}
 CV <- apply(ErrCV, 2, mean)
 plot(CV,type="l")

When I run the code, it keeps saying the subscript is out of bounds.

Tags: r

Solution
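
The error most likely comes from using the value of k as a column index. `k = 3:10` is a vector, so `matrix(0, nrow=F, ncol= k)` does not create one column per candidate. As far as I know, R uses only the first element (3) here, which would give a 5 x 3 matrix. Inside the loop, `ErrCV[f,k]` writes to column number k, which works for k = 3 and fails with "subscript out of bounds" once k reaches 4. The inner `for(k in 3:10)` also overwrites the vector `k`, and `knn()` comes from the `class` package, which the posted code never loads.

Below is a minimal sketch of one possible fix: size the matrix with `length()` of the candidate vector and index its columns by position. The built-in `iris` data is only a stand-in so that the example runs without the CSV file, and the names `k_values`, `n_folds`, `held_out` and `best_k` are illustrative, not taken from the question.

```r
library(class)  # provides knn()

# Assumption: iris stands in for the standardized Caravan predictors,
# so this sketch runs without Caravan.csv.
set.seed(2001835)
X <- scale(iris[, 1:4])
Y <- iris$Species
n <- nrow(X)

n_folds  <- 5
k_values <- 3:10   # candidate numbers of neighbours

# Assign each row to one of n_folds folds, then shuffle the assignment
folds <- cut(seq_len(n), breaks = n_folds, labels = FALSE)
folds <- folds[sample(n)]

# One column per candidate k: ncol is a single number, not the vector 3:10
ErrCV <- matrix(0, nrow = n_folds, ncol = length(k_values),
                dimnames = list(NULL, paste0("k=", k_values)))

for (f in seq_len(n_folds)) {
  held_out <- which(folds == f)
  for (j in seq_along(k_values)) {
    pred <- knn(train = X[-held_out, ], test = X[held_out, ],
                cl = Y[-held_out], k = k_values[j])
    # Index by column position j (1..8), not by the value of k (3..10)
    ErrCV[f, j] <- mean(pred != Y[held_out])
  }
}

CV <- colMeans(ErrCV)                 # mean error per candidate k
best_k <- k_values[which.min(CV)]     # candidate with the lowest CV error
plot(k_values, CV, type = "l", xlab = "k", ylab = "CV error")
```

With the Caravan data, the same loop should work after replacing `X` and `Y` with `train.X` and `train.Y` from the question. Plotting against `k_values` keeps the x-axis labelled with the actual number of neighbours rather than the column position.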

