LIME library in R throws "Error: Response is constant across permutations. Please check your model"

Problem description

I am looking for a kind soul to help me resolve this error in R with my current RF model:

Error: Response is constant across permutations. Please check your model

Here are the files needed to run the code: link

Here is my code:

library("lime")
library("randomForest")
RF <- readRDS("RF_classifier4sRNA.rds") # Load the model

origTrainingData <- read.csv( "training_combined.csv", header = TRUE, sep = ",") # load Orig Training data

origTrainingDataLabels <- read.csv( "training_combined_labels.csv", header = TRUE, sep = "," )
                                                        # load Orig Training data labels
Classification <- origTrainingDataLabels$Class
origTrainingDataWithLabels <- cbind(origTrainingData, Classification)

# instances to explain ----
inputFile <- "FeatureTable.tsv"
testData <- read.table( inputFile, sep = "\t", header = TRUE)
class(testData)

testDataPredictions <- predict(RF, testData, type="prob")
testDataPredictions
# randomForest
# RF <- readRDS("RF_classifier4sRNA.rds")
# pred <- predict(RF, data, type = "prob")

predict_model.randomForest <- function(x, newdata, type, ...) {
  res <- predict(x, newdata = newdata, ...)
  switch(
    type,
    raw = data.frame(Response = res$class, stringsAsFactors = FALSE),
    prob = as.data.frame(res["posterior"], check.names = FALSE)
  )
}

model_type.randomForest <- function(x, ...) 'classification'

?lime()
lime_explainer <- lime( origTrainingData,      # Original training data
                        RF,                    # The model to explain
                        bin_continuous = TRUE, # Should continuous variables be binned 
                                               # when making the explanation
                        n_bins = 5,           # The number of bins for continuous variables 
                                               # if bin_continuous = TRUE
                        quantile_bins = FALSE  # Should the bins be based on n_bins quantiles
                                               # or spread evenly over the range of the training data
                        )
lime_explanations <- explain( testData,           # Data to explain
                              lime_explainer,     # Explainer to use
                              n_labels = 7,
                              n_features = 7,
                              n_permutations = 10,
                              feature_select = "none"
                            )
lime_explanations

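To be fair, I am not the author of the original random forest model, which can be found here: github. The full documentation and all other related files are available here: https://peerj.com/articles/6304/. I am just trying to apply LIME to that model.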

Tags: r, random-forest, lime

Solution


In the end, my professor was able to help me :D

So here is the function that actually works with LIME for my specific use case:

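predict_model.randomForest <- function(x, newdata, type, ...) {
  # Ask randomForest for class probabilities explicitly; res[, 2] below
  # needs a probability matrix, not a factor of predicted classes.
  res <- predict(x, newdata = newdata, type = "prob", ...)
  # Debugging output; remove once the explainer runs
  print(class(res))
  print(dim(res))
  print(res)
  # switch() must be the last expression so that its value is returned
  switch(
    type,
    raw = data.frame(Response = ifelse(res[, 2] > 0.5, "sRNA", "notSRNA"),
                     stringsAsFactors = FALSE),
    prob = as.data.frame(res)
  )
}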

model_type.randomForest <- function(x, ...) 'classification'
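
For anyone who cannot download the original model files, the same pattern can be tried on a built-in dataset. What follows is only a minimal sketch under stated assumptions, not the code from the paper: it trains a small two-class randomForest on part of iris as a stand-in for the sRNA classifier, and the names rf, features, probs, explainer and explanation are made up for illustration. The values ntree = 100 and n_permutations = 500 are arbitrary choices. The idea is to call the wrapper directly first and confirm that it returns a data frame of per-class probabilities before handing the model to lime.

```r
library(randomForest)
library(lime)

set.seed(42)

# Two-class toy problem built from iris (a stand-in for sRNA / notSRNA)
df <- iris[iris$Species != "setosa", ]
df$Species <- factor(df$Species)   # drop the unused "setosa" level
features <- df[, 1:4]

rf <- randomForest(x = features, y = df$Species, ntree = 100)

# Tell lime that a plain randomForest object is a classifier
model_type.randomForest <- function(x, ...) "classification"

# Tell lime how to get predictions from it
predict_model.randomForest <- function(x, newdata, type, ...) {
  # Always request probabilities, then derive the class label if needed
  prob <- unclass(predict(x, newdata = newdata, type = "prob"))
  switch(
    type,
    raw  = data.frame(Response = colnames(prob)[max.col(prob)],
                      stringsAsFactors = FALSE),
    prob = as.data.frame(prob)
  )
}

# Sanity check: call the wrapper directly before involving lime
probs <- predict_model(rf, features[1:5, ], type = "prob")
stopifnot(is.data.frame(probs), ncol(probs) == 2)

explainer <- lime(features, rf, bin_continuous = TRUE, n_bins = 5)
explanation <- explain(features[1:2, ], explainer,
                       n_labels = 2, n_features = 4,
                       n_permutations = 500)
head(explanation[, c("case", "label", "feature", "feature_weight")])
```

If the sanity check passes but explain() still reports a constant response, it may be worth printing the predicted probabilities for a handful of perturbed rows to see whether the model really returns the same value for all of them.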
