xgboost: need label when data is a matrix

Problem description

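I have been stuck for hours trying to run XGBoost in R. I have training and test data with about 40 columns; the last column is the target column, a nominal 0/1 value. I am running this code, which I got from https://www.kaggle.com/michaelpawlus/xgboost-example-0-76178/code.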

require(xgboost)
library(xgboost)

train  <- read.csv(file.choose(),header = T)
test   <- read.csv(file.choose(),header = T)

feature.names <- names(train)[2:ncol(train)-1]

 clf <- xgboost(data        = data.matrix(train[,feature.names]),
               label       = train$target,
               nrounds     = 100, # 100 is better than 200
               objective   = "binary:logistic",
               eval_metric = "auc")

 cat("making predictions in batches due to 8GB memory limitation\n")
 submission <- data.frame(ID=test$ID)
 submission$target1 <- NA 
 for (rows in test) {
    submission[rows, "Succeed"] <- predict(clf, data.matrix(test[rows,feature.names]))
 }

 varimp_clf <- xgb.importance(feature_names=feature.names,model=clf)

 xgb.plot.importance(varimp_clf)

These are the errors I get:

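Error in xgb.get.DMatrix(data, label, missing, weight) : xgboost: need label when data is a matrix

Error in `$<-.data.frame`(`*tmp*`, target1, value = NA) : replacement has 1 row, data has 0

Error in predict(clf, data.matrix(test[rows, feature.names])) : object 'clf' not found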

Tags: r, machine-learning

Solution


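Check your input data. Is your last column actually named target? It sounds like it is not. If the data frame has no column called target, train$target evaluates to NULL, so xgboost receives a matrix with no label and stops with the first error. The third error is only a consequence: clf was never created because the xgboost() call failed. The second error points to the same kind of problem on the test side: test$ID is NULL when there is no ID column, so data.frame(ID=test$ID) has 0 rows.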
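To make the check concrete, here is a minimal base-R sketch. It uses a small made-up data frame in place of the real CSV; the column names x1, x2 and Succeed are hypothetical and only stand in for whatever your file actually contains. It shows that asking for a missing column by name silently gives NULL, and one possible way to pick the label by position instead of by a hard-coded name.

```r
# Toy stand-in for the real training data (hypothetical column names).
train <- data.frame(
  x1      = c(0.1, 0.5, 0.9, 0.3),
  x2      = c(1, 0, 1, 0),
  Succeed = c(0, 1, 1, 0)   # the 0/1 outcome, but NOT named "target"
)

# `$` on a column that does not exist returns NULL without any warning,
# so a call like xgboost(..., label = train$target) would get no label.
label_by_name <- train$target
print(is.null(label_by_name))

# Look at the real column names before hard-coding one.
print(names(train))

# Take the label from the last column by position, and treat every
# other column as a feature.
label_col     <- names(train)[ncol(train)]
feature.names <- setdiff(names(train), label_col)
label         <- train[[label_col]]

# Build the numeric matrix and confirm the label lines up with its rows.
x <- data.matrix(train[, feature.names, drop = FALSE])
stopifnot(!is.null(label), length(label) == nrow(x))
```

If your file also has an identifier column, you would probably want to drop it from feature.names in the same way, for example with setdiff(names(train), c(label_col, "ID")), assuming the column is really called ID.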

