I'm trying to use ROSE to help with imbalanced sampling. My ovun.sample code is producing NULL values. How do I fix it?

Question

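I'm trying to use ROSE to help with an imbalanced dataset. I'm about 90% of the way there, but my ovun.sample code has a problem. When I run it, the "over", "under", and "both" datasets are not created: they show up in R as NULL (empty) instead of containing data. I'd appreciate any help on how to fix this!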

set.seed(123)
ind <- sample(2, nrow(credit), replace = TRUE, prob = c(0.7, 0.3))
train <- credit[ind==1,]
test <- credit[ind==2,]


# Data for Developing the Predictive Model
table(train$DEFAULT)
prop.table(table(train$DEFAULT))
summary(train)


# Sample balancing (Over-, Under-, Both)
library(ROSE)
over <- ovun.sample(DEFAULT~., data = train, method = "over", N = 996)$credit
table(over$DEFAULT)

under <- ovun.sample(DEFAULT~., data = train, method = "under", N = 414)$credit
table(under$DEFAULT)

both <- ovun.sample(DEFAULT~., data = train, method = "both",
                    p = 0.5, seed = 213, N = 705)$credit
table(both$DEFAULT)


# Predictive Model (Random Forest)
library (randomForest)
rftrain <- randomForest(DEFAULT~., data = train,
                        ntree = 500, mtry = 10)
rfover <- randomForest(DEFAULT~., data = over,
                       ntree = 500, mtry = 10)
rfunder <- randomForest(DEFAULT~., data = under,
                        ntree = 500, mtry = 10)
rfboth <- randomForest(DEFAULT~., data = both,
                        ntree = 500, mtry = 10)


# Predictive Model Evaluation with test Data
library(caret)
confusionMatrix(predict(rftrain, test), test$DEFAULT, positive = '1')
confusionMatrix(predict(rfover, test), test$DEFAULT, positive = '1')
confusionMatrix(predict(rfunder, test), test$DEFAULT, positive = '1')
confusionMatrix(predict(rfboth, test), test$DEFAULT, positive = '1')

Tags: r, sampling, downsampling, oversampling

Answer


I ran into the same problem. Try changing the code as follows:

over <- ovun.sample(DEFAULT~., data = train, method = "over", N = 996)$data

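Change the "under" and "both" calls the same way and it works. ovun.sample returns a list, and the balanced data frame is always stored in its element named data, whatever the input data frame is called. The list has no element named credit, so $credit evaluates to NULL.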
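
If you want to see why $credit comes back as NULL, here is a minimal base-R sketch. It does not call ROSE at all: fake_result is a made-up stand-in for the object ovun.sample returns (the element names Call, method and data are taken from the ROSE documentation, the contents are invented), and get_balanced is a hypothetical helper of mine, not part of the package.

```r
# Stand-in for the list returned by ovun.sample(); the element names follow
# the ROSE documentation, the values are invented for illustration.
fake_result <- list(
  Call   = quote(ovun.sample(DEFAULT ~ ., data = train, method = "over")),
  method = "over",
  data   = data.frame(DEFAULT = factor(c(0, 1, 1, 0)),
                      x       = c(1.2, 3.4, 3.4, 0.7))
)

names(fake_result)               # shows which elements actually exist
is.null(fake_result$credit)      # TRUE: there is no element called "credit"
is.data.frame(fake_result$data)  # TRUE: the data frame lives in $data

# Hypothetical helper: stop with a message instead of silently passing NULL on.
get_balanced <- function(result) {
  if (is.null(result$data)) {
    stop("no 'data' element; available elements: ",
         paste(names(result), collapse = ", "))
  }
  result$data
}

over_demo <- get_balanced(fake_result)
table(over_demo$DEFAULT)
```

On the real object, running names() or str() on the result of ovun.sample before indexing into it should show the same thing, since $ on a list returns NULL for a missing name rather than raising an error.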

