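r - Why do I get an error when I try to predict with a random forest model?
Problem description
I am trying to make predictions with a random forest model.
My data looks like this: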
> str(margins_data)
'data.frame': 457961 obs. of 10 variables:
 $ month           : Factor w/ 7 levels "April","August",..: 6 6 4 6 2 1 5 6 5 4 ...
 $ miles           : num 416 1559 1156 672 1188 ...
 $ equipment       : Factor w/ 3 levels "Flat","Reefer",..: 1 3 3 3 3 2 3 2 3 3 ...
 $ originstate     : Factor w/ 62 levels " ","AB","AL",..: 20 55 14 34 14 56 14 34 57 14 ...
 $ destinationstate: Factor w/ 62 levels "AB","AK","AL",..: 17 7 55 27 55 8 55 32 46 12 ...
 $ margin          : num 800 450 450 200 450 700 500 375 200 200 ...
 $ ldi             : num 2.5 4.84 3.1 1.75 3.35 ...
 $ weight          : int 40000 43000 40000 10000 39000 35000 39000 7817 38000 42720 ...
 $ commoditygroup  : Factor w/ 49 levels "Agriculture",..: 18 9 18 15 42 38 18 22 27 18 ...
 $ customerindustry: Factor w/ 352 levels "Abrasive, Asbestos, And Miscellaneous",..: 300 336 336 229 336 133 336 133 260 264 ...
 - attr(*, "na.action")= 'omit' Named int 1182 2282 2869 2999 3082 4609 5360 5444 5445 6029 ...
  ..- attr(*, "names")= chr "1182" "2282" "2869" "2999" ...
I split the data into a training set and a test set:
N <- nrow(margins_data)
target <- round(N * 0.75)
gp <- runif(N)
margin_train <- margins_data[gp < 0.75, ]
margin_test <- margins_data[gp >= 0.75, ]
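The split above draws from `runif` without a seed, so the rows that land in each set change from run to run, and `target` is computed but never used. A minimal sketch of a reproducible version of the same random split follows; the `toy` data frame and its `id` column are hypothetical stand-ins for `margins_data`, and reusing the question's seed value here is an assumption made only for illustration.

```r
# Hypothetical stand-in for margins_data, small enough to run anywhere.
toy <- data.frame(id = 1:1000, margin = rnorm(1000))

# Seeding before runif() makes the train/test assignment repeatable.
set.seed(423563)
gp <- runif(nrow(toy))

# Same rule as in the question: roughly 75% train, 25% test.
train <- toy[gp < 0.75, ]
test  <- toy[gp >= 0.75, ]

# Every row ends up in exactly one of the two sets.
stopifnot(nrow(train) + nrow(test) == nrow(toy))
```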
And defined my model parameters:
seed <- 423563
outcome <- "margin"
vars <- c("miles", "equipment", "originstate", "destinationstate", "margin",
          "ldi", "weight", "commoditygroup", "customerindustry")
fmla <- paste(outcome, "~", paste(vars, collapse = " + "))
margin_model_rf <- ranger(fmla,
                          margin_train,
                          num.trees = 500,
                          respect.unordered.factors = "order",
                          seed = seed)
margin_model_rf
Call:
ranger(fmla, margin_train, num.trees = 500, respect.unordered.factors = "order", seed = seed)
Type: Regression
Number of trees: 500
Sample size: 343253
Number of independent variables: 9
Mtry: 3
Target node size: 5
Variable importance mode: none
Splitrule: variance
OOB prediction error (MSE): 840.8202
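Note that the `vars` vector in the question lists the outcome `margin` among the predictors. Whether that contributes to the error is not established here; the sketch below only shows one way to build the formula so that the outcome cannot appear on the right-hand side. The name `predictors` is introduced for illustration and is not in the original code.

```r
outcome <- "margin"
vars <- c("miles", "equipment", "originstate", "destinationstate", "margin",
          "ldi", "weight", "commoditygroup", "customerindustry")

# Drop the outcome from the predictor list, if it is present.
predictors <- setdiff(vars, outcome)

# Build a real formula object rather than a character string.
fmla <- as.formula(paste(outcome, "~", paste(predictors, collapse = " + ")))

print(fmla)
```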
When I try to predict on the test data, I get the following error:
margin_predict <- predict(margin_model_rf, margin_test)
Error: Missing data in columns: weight.
In addition: Warning message:
In mapply(function(x, y) { :
longer argument not a multiple of length of shorter
Any help with this would be greatly appreciated.
Solution
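One possible line of investigation, offered as a sketch rather than a confirmed diagnosis: the `mapply` warning hints at a length mismatch, which could arise if the test data frame carries columns the model was not trained on, and the "Missing data" message could reflect either real `NA` values or factor levels that do not line up. The checks below use base R only. The `train` and `test` data frames, their values, and the `predictors` vector are hypothetical stand-ins for `margin_train`, `margin_test`, and the model's covariates.

```r
# Hypothetical toy stand-ins for margin_train / margin_test.
train <- data.frame(
  margin    = c(800, 450, 200, 700),
  weight    = c(40000L, 43000L, 10000L, 35000L),
  equipment = factor(c("Flat", "Van", "Van", "Reefer")),
  month     = factor(c("May", "June", "May", "April"))
)
test <- data.frame(
  margin    = c(450, 500),
  weight    = c(39000L, NA),
  equipment = factor(c("Van", "Tanker")),
  month     = factor(c("May", "June"))
)

# Assumed list of covariates the model was trained on.
predictors <- c("weight", "equipment")

# 1. Count genuinely missing values per predictor in the test set.
na_counts <- colSums(is.na(test[, predictors, drop = FALSE]))
print(na_counts)

# 2. List factor levels that occur in the test set but not in training.
factor_cols <- predictors[sapply(train[predictors], is.factor)]
unseen <- lapply(setNames(factor_cols, factor_cols), function(col) {
  setdiff(unique(as.character(test[[col]])), levels(train[[col]]))
})
print(unseen)

# 3. Keep only the predictor columns, in the same order as in training,
#    so that no outcome or unused column is passed along.
test_x <- test[, predictors, drop = FALSE]

# Hypothetical follow-up with the question's objects (not verified here):
# margin_predict <- predict(margin_model_rf, margin_test[, predictors])
```

If the first two checks come back clean on the real data, a column mismatch between the test data and the fitted model becomes the more likely suspect, and updating `ranger` to a current version may also be worth trying.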