Titanic dataset: logistic regression model, confusion matrix gives 0 as output

Question

I am running a logistic regression model on the Titanic dataset using the following code:

#Modeling

#Split into train and test and fit the logistic regression model


titanic_train <- titanic_complete[1:891,]
titanic_test <- titanic_complete[892:1309,]

##############Logistic Regression ###############################

glm_model = glm(Survived~.,data= titanic_train, family = 'binomial')
summary(glm_model)


## Using anova() to analyze the table of deviance
anova(glm_model, test="Chisq")


final_model = glm(Survived~Sex + Pclass + Age + SibSp + Cabin_f, data = titanic_train, family = 'binomial')
summary(final_model)

varImp(glm_model)

glm.pred <-predict(final_model, titanic_test, type = 'response')
glm.pred <- ifelse(glm.pred > 0.5, "yes", "no")

glm.pred

confusionMatrix(glm.pred, titanic_test$Survived)

As a result, I get this error message:

> confusionMatrix(glm.pred, titanic_test$Survived)
[1] no  yes
<0 rows> (or 0-length row.names)
In Ops.factor(predictedScores, threshold) :
  ‘<’ not meaningful for factors

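I can't make sense of this error message or work out what went wrong. The model runs fine on the training data. I assume it has something to do with the threshold I applied to the Survived variable (which is a factor variable with levels 1 and 0).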

Tags: logistic-regression, confusion-matrix, model-validation

Solution
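
One possible reading, offered as a sketch rather than a confirmed diagnosis: the predictions are recoded to "yes"/"no" while `Survived` is a factor with levels 0 and 1, so the two vectors share no labels to cross-tabulate. The warning also mentions `predictedScores` and `threshold`, which suggests the `confusionMatrix` being called expects numeric scores as its second argument and may come from a package other than caret. It is also worth checking whether `titanic_test$Survived` holds any labels at all: if `titanic_complete` was built by stacking Kaggle's train and test files (an assumption), rows 892:1309 would be `NA`. The base-R example below uses made-up values (`actual` and `prob` are hypothetical stand-ins, not Titanic data) to show thresholding to the same labels as the reference before tabulating.

```r
# Hypothetical stand-ins for titanic_test$Survived and the predicted
# probabilities from predict(..., type = 'response'); not real Titanic data.
actual <- factor(c(0, 1, 1, 0, 1, 0), levels = c(0, 1))
prob   <- c(0.20, 0.80, 0.40, 0.10, 0.90, 0.60)

# Threshold to the SAME labels the reference uses (0/1, not "yes"/"no"),
# and reuse its levels so both factors line up.
pred <- factor(ifelse(prob > 0.5, 1, 0), levels = levels(actual))

# Base-R confusion matrix: rows are predictions, columns are true labels.
cm <- table(Predicted = pred, Actual = actual)
print(cm)

# Share of observations on the diagonal (correct predictions).
accuracy <- sum(diag(cm)) / sum(cm)
print(accuracy)
```

With caret loaded, the corresponding call would be `caret::confusionMatrix(pred, actual)`, which requires both arguments to be factors with the same levels; writing the `caret::` prefix avoids picking up a same-named function from another package.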

