Why do I get an error when trying to create a confusion matrix for a decision tree?

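Problem description

I am learning how to use decision trees in R.

I built a model and made a prediction, and I want to check the model's accuracy. However, when I try to build a confusion matrix with the table function, I get this error:

Error in table(test_data$Outcome, predictn) : all arguments must have the same length

The code I am using is: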

data =  read.csv("C:/Users/VIJAY/Desktop/ML/logistic regression/diabetes.csv")

head(data)
dim(data)


library(rpart)
library(rpart.plot)
library(caret)

s = sample(768,600)

train_data = data[s,]
test_data = data[-s,]

model = rpart(Outcome ~.,data = train_data, method = "class")
rpart.plot(model,cex = .9)

predictn = predict(model,data= test_data,type = "class")

tab = table(test_data$Outcome,predictn)

Tags: r, decision-tree

Solution


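The response from your test set and your predictions have different lengths. In the code above, predict() is called with data = test_data, but the argument that predict() for an rpart model uses for new observations is newdata. An unrecognized data argument is silently ignored, so the predictions come back for the 600 training rows rather than for the held-out test rows. If the lengths still differ after switching to newdata = test_data, check whether some predictors in the test set have missing values; in that case consider using surrogate variables, or deleting the test rows that have missing values in those predictors.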
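One thing worth checking in the question's code is the predict() call, which passes data = test_data. The sketch below shows what that does compared with newdata = test_data. It is only an illustration under assumptions: diabetes.csv is not available here, so it uses the kyphosis data set that ships with rpart as a stand-in, and the split size of 60 and the seed are arbitrary choices.

```r
library(rpart)

# Stand-in data: kyphosis (81 rows) ships with rpart; it replaces
# diabetes.csv purely for illustration.
set.seed(1)
s <- sample(nrow(kyphosis), 60)
train_data <- kyphosis[s, ]
test_data  <- kyphosis[-s, ]

model <- rpart(Kyphosis ~ ., data = train_data, method = "class")

# `data =` is not an argument of predict() for rpart models; it falls
# into `...`, so the predictions are for the training rows.
pred_wrong <- predict(model, data = test_data, type = "class")
length(pred_wrong)   # 60, the training-set size

# `newdata =` scores the held-out rows instead.
pred_test <- predict(model, newdata = test_data, type = "class")
length(pred_test)    # 21, the test-set size

# Both vectors now have the same length, so table() works.
tab <- table(actual = test_data$Kyphosis, predicted = pred_test)
tab
```

With the question's own objects, the equivalent change would be predictn = predict(model, newdata = test_data, type = "class").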

By the way, since you are already loading caret, it has a convenient function for this: caret::confusionMatrix().
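A minimal sketch of how caret::confusionMatrix() could be used, again with the kyphosis data from rpart as a stand-in for diabetes.csv (an assumption made only so the example runs on its own). Both arguments need to be factors with the same levels; in the question's data, Outcome is read from CSV as a number, so it would probably need wrapping in factor() first. Depending on the installed caret version, the e1071 package may also be required.

```r
library(rpart)
library(caret)

# Stand-in data and an arbitrary split, for illustration only.
set.seed(1)
s <- sample(nrow(kyphosis), 60)
train_data <- kyphosis[s, ]
test_data  <- kyphosis[-s, ]

model <- rpart(Kyphosis ~ ., data = train_data, method = "class")
pred_test <- predict(model, newdata = test_data, type = "class")

# `data` holds the predicted classes, `reference` the true classes.
cm <- confusionMatrix(data = pred_test, reference = test_data$Kyphosis)

cm$table                 # the confusion matrix itself
cm$overall["Accuracy"]   # overall accuracy on the test rows
```

Besides the table, the returned object also carries per-class statistics such as sensitivity and specificity in cm$byClass.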

