首页 > 解决方案 > Rshiny 数据分类和汇总统计错误

问题描述

我是 Rshiny 的新手。我的任务是:

编写一个使用导航栏的闪亮应用程序,标题为“数据探索”和“分类工具”,以便在“数据探索”选项卡中,用户可以:

  1. 选择任何变量并查看它的摘要统计信息,

  2. 使用选择输入按速率类别查看变量的汇总统计信息。

  3. 首次打开应用程序时,按费率组查看 beertax 变量图。

  4. 选择任何变量并可视化它与 Rate 变量的关系。根据所选变量是连续变量还是分类变量,应显示不同的图。

分类工具选项卡中,用户可以:

  1. 使用滑块输入从 (0.4, 0.5, 0.6, 0.7, 0.8) 中选择用于训练数据集的数据比例(我们不会在这里使用测试数据,因此您应该使用补码比例进行验证数据集。)

  2. 查看训练数据的分类树,并使用单选按钮“查看修剪树”或“查看未修剪树”。其中,对于修剪后的树,应该使用与最小 xerror 对应的 cp 值进行修剪。

  3. 查看修剪分类树和 LDA 或 QDA 之一的正确和错误分类率(使用验证数据),这些结果应该在表格中呈现。“最佳”分类方法,即错误分类率最低的分类方法,应该被突出显示,并且应该有一个注释来告知用户突出显示的含义。

  4. 使用“最佳”分类方法,使用用户定义的一组观察到的变量值(即需要输入选项以允许用户输入他们想要的观察值)。打开应用程序时出现的默认用户定义值应该是连续变量的平均值和分类变量的众数。还应该有一个警告,在用户推断时提醒用户。

目前,我几乎已经完成了分类工具选项卡的第 3 步。但是,我遇到了需要帮助的各种错误。我将按顺序讨论它们:

  1. 当我打开 Rstudio 后运行应用程序时,我的第一个选项卡“数据探索”运行良好。但是,每当我重新加载应用程序时,汇总表都会失败并且我收到错误“未使用的参数 (input$variable) ”。我不确定为什么会这样。

  2. 当我运行该应用程序时,第二个选项卡“分类工具”上出现错误,显示为“二进制运算符的非数字参数”。我调查了这意味着什么,我想我理解它,但我只是不确定这个错误如何适用于我的代码。而不是这个错误,我的目标是生成一个表格,其中包括 CART 模型的分类率和错误分类率,最终也是 LDA 模型(取决于输入的训练数据比例)。

  3. 从问题 2 开始,我调查了我的代码并意识到我认为之前的错误是由于lda.pred <- predict(lda.model, newdata = valid.data[,-6])行而发生的,所以我删除了代码看看会发生什么,然后我得到一个新的错误“未使用的参数(pred == valid.label) ”。同样,我不明白为什么会这样。

这是我的代码:

data <- read.csv("Fatality-task2.csv")

data$Rate <- as.factor(data$Rate)

library(shiny)
library(dplyr)
library(ggplot2)
library(markdown)
library(gtsummary)
library(ggdendro)
library(factoextra)
library(mclust)
library(cluster)
library(rpart)
library(rpart.plot)
#library(MASS)

#################################################################


ui <- fluidPage(
  navbarPage("",
             tabPanel("Data Exploration",
                      sidebarLayout(
                        sidebarPanel(
                          selectInput("variable",
                                      "Variable",
                                      colnames(data)),
                          selectInput("rate",
                                      "Rate",
                                      levels(data$Rate))
                        ),
                        mainPanel(
                          tableOutput("table"),
                          plotOutput("plot")
                        )
                      )    
             ),
             tabPanel("Classification tools",
                      sidebarLayout( 
                        sidebarPanel(
                          sliderInput("train.prop",
                                      "Training data proportion",
                                      min = 0.4,
                                      max = 0.8,
                                      step = 0.1,
                                      value = 0.6),
                          radioButtons("prune",
                                       "Pruning option",
                                       choices = c("view pruned tree",
                                                   "view unpruned tree"))
                        ),
                        mainPanel(
                          tableOutput("table2"),
                          plotOutput("plot2")
                          
                        )
                      )
             )
  )
)




#################################################################

server <- function(input, output) {
  sum <- reactive({
    data <- data %>%
      filter(Rate == input$rate) %>%
      select(input$variable) %>%
      summary() %>%
      as.data.frame() %>%
      tidyr::separate(Freq, c("Stat", "Value"), sep=":") %>%
      tidyr::pivot_wider(names_from =Stat, values_from = Value)
    data <- data[, -c(1,2)]
    
  })
  
  output$table <- renderTable({
    sum()
  })
  
  output$plot <- renderPlot({
    if (input$variable == "jaild" | input$variable == "Rate"){
      ggplot(data, aes(x = Rate, fill = data[[input$variable]])) +
        geom_bar(position = "dodge", width = 0.7) +
        if (input$variable == "Rate"){
          theme(legend.position = "none")
        }
    } else {
      ggplot(data, aes(x = Rate, y = data[[input$variable]], fill = Rate)) +
        geom_boxplot() +
        theme(legend.position = "none")
    }
    
    
  })
  
  output$plot2 <- renderPlot({
    
    set.seed(1234)
    n <- nrow(data)
    ind1 <- sample(c(1:n), round(n*input$train.prop))
    ind2 <- sample(c(1:n)[-ind1], length(c(1:n)[-ind1]))
    
    
    train.data <- data[ind1,]
    valid.data <- data[ind2,]
    
    
    fit.tree <- rpart(Rate~., data = train.data, method = "class")
    
    ptree <- prune(fit.tree, cp = fit.tree$cptable[which.min(fit.tree$cptable[,"xerror"]),"CP"])
    
    if (input$prune == "view pruned tree"){
      rpart.plot(ptree, uniform =TRUE)
    } else {
      rpart.plot(fit.tree)
    }
    
    
    
    
  })
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  #######################
  table <- reactive({
    #################################
    library(MASS)
    set.seed(1234)
    n <- nrow(data)
    ind1 <- sample(c(1:n), round(n*input$train.prop))
    #ind2 <- sample(c(1:n)[-ind1], length(c(1:n)[-ind1]))
    ind2 <- setdiff(c(1:n), ind1)
    
    
    train.data <- data[ind1,]
    valid.data <- data[ind2,]
    
    train.label <- data[ind1, 6]
    valid.label <- data[ind2, 6]
    
    #################################
    
    ### fit cart model
    fit.tree <- rpart(Rate~., data = train.data, method = "class")
    
    ### prune the tree
    ptree <- prune(fit.tree, cp = fit.tree$cptable[which.min(fit.tree$cptable[,"xerror"]),"CP"])
    
    ### predict using the validation data on the pruned tree
    pred <- predict(ptree, newdata = valid.data, type = "class")
    
    ### lda
    
    lda.model <- lda(train.data[,-6], train.label)
    
    #lda.pred <- predict(lda.model, newdata = valid.data[,-6])
    
    
    ### create a classification table
    
    
    
    CCR <- sum(pred == valid.label)/nrow(valid.data)
    MCR <- 1 - CCR
    
    CR <- c(CCR, MCR)
    
    CR <- as.data.frame(CR)
    colnames(CR) <- "CART"
    rownames(CR) <- c("CCR", "MCR")
    
  })
  
  
  
  
  
  
  
  
  
  
  
  
  
  #############################
  output$table2 <- renderTable({
    table()
    
  }) 
  
}




#################################################################
shinyApp(ui, server)

在上述 3 种情况下,我的应用程序输出如下所示(按顺序)

在此处输入图像描述

在此处输入图像描述

在此处输入图像描述

我迫切需要帮助,因此不胜感激。如果您需要对代码进行任何说明,请告诉我。

~编辑~

这是解决我的第二个和第三个错误后我更新的代码(我认为?):

data <- read.csv("Fatality-task2.csv")

data$Rate <- as.factor(data$Rate)

library(shiny)
library(dplyr)
library(ggplot2)
library(markdown)
library(gtsummary)
library(ggdendro)
library(factoextra)
library(mclust)
library(cluster)
library(rpart)
library(rpart.plot)
#library(MASS)

dput(head(data))

#################################################################


ui <- fluidPage(
  navbarPage("",
             tabPanel("Data Exploration",
                      sidebarLayout(
                        sidebarPanel(
                          selectInput("variable",
                                      "Variable",
                                      colnames(data)),
                          selectInput("rate",
                                      "Rate",
                                      levels(data$Rate))
                        ),
                        mainPanel(
                          tableOutput("table"),
                          plotOutput("plot")
                        )
                      )
             ),
             tabPanel("Classification tools",
                      sidebarLayout(
                        sidebarPanel(
                          sliderInput("train.prop",
                                      "Training data proportion",
                                      min = 0.4,
                                      max = 0.8,
                                      step = 0.1,
                                      value = 0.6),
                          radioButtons("prune",
                                       "Pruning option",
                                       choices = c("view pruned tree",
                                                   "view unpruned tree"))
                        ),
                        mainPanel(
                          tableOutput("table2"),
                          plotOutput("plot2")
                          
                        )
                      )
             )
  )
)




#################################################################

server <- function(input, output) {
  sum <- reactive({
    req(input$variable,input$rate)
    data <- data %>%
      filter(Rate == input$rate) %>%
      select(input$variable) %>%
      summary() %>%
      as.data.frame() %>%
      tidyr::separate(Freq, c("Stat", "Value"), sep=":") %>%
      tidyr::pivot_wider(names_from =Stat, values_from = Value)
    data <- data[, -c(1,2)]
    
  })
  
  output$table <- renderTable({
    sum()
  })
  
  output$plot <- renderPlot({
    req(input$variable)
    if (input$variable == "jaild" | input$variable == "Rate"){
      ggplot(data, aes(x = Rate, fill = .data[[as.name(input$variable)]])) +
        geom_bar(position = "dodge", width = 0.7) +
        if (input$variable == "Rate"){
          theme(legend.position = "none")
        }
    } else {
      ggplot(data, aes(x = Rate, y = .data[[as.name(input$variable)]], fill = Rate)) +
        geom_boxplot() +
        theme(legend.position = "none")
    }
    
    
  })
  
  output$plot2 <- renderPlot({
    req(input$train.prop,input$prune)
    set.seed(1234)
    n <- nrow(data)
    ind1 <- sample(c(1:n), round(n*as.numeric(input$train.prop)))
    ind2 <- sample(c(1:n)[-ind1], length(c(1:n)[-ind1]))
    
    
    train.data <- data[ind1,]
    valid.data <- data[ind2,]
    
    
    fit.tree <- rpart(Rate~., data = train.data, method = "class")
    
    ptree <- prune(fit.tree, cp = fit.tree$cptable[which.min(fit.tree$cptable[,"xerror"]),"CP"])
    
    if (input$prune == "view pruned tree"){
      rpart.plot(ptree, uniform =TRUE)
    } else {
      rpart.plot(fit.tree)
    }
  })
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  #######################
  table <- reactive({
    #################################
    library(MASS)
    set.seed(1234)
    n <- nrow(data)
    ind1 <- sample(c(1:n), round(n*input$train.prop))
    #ind2 <- sample(c(1:n)[-ind1], length(c(1:n)[-ind1]))
    ind2 <- setdiff(c(1:n), ind1)
    
    train.data <- data[ind1,]
    valid.data <- data[ind2,]
    
    #################################
    
    ### fit cart model
    fit.tree <- rpart(Rate~., data = train.data, method = "class")
    
    ### prune the tree
    ptree <- prune(fit.tree, cp = fit.tree$cptable[which.min(fit.tree$cptable[,"xerror"]),"CP"])
    
    ### predict using the validation data on the pruned tree
    pred <- predict(ptree, newdata = valid.data[,-6], type = "class")
    
    ### lda
    
     #lda.model <- lda(train.data[,-6], train.data[,6])
     
     lda.model <- lda(Rate~., data = train.data)
     
    
     lda.pred <- predict(lda.model, newdata = valid.data)
    
    
    ### create a classification table
    
    length(lda.model)
    
    
    x <- pred == valid.data[,6]
    
    CCR <- length(x[x == TRUE])/nrow(valid.data)
    MCR <- 1 - CCR
    
    CR <- c(CCR, MCR)
    
    z <- lda.pred$class == valid.data[,6]
    
    lda.CCR <- length(z[z == TRUE])/nrow(valid.data)
    lda.MCR <- 1 - CCR
    
    lda.CR <- c(lda.CCR, lda.MCR)
    
    y <- cbind(CR, lda.CR)
    
    y <- as.data.frame(y)
    colnames(y) <- c("CART", "LDA")
    rownames(y) <- c("CCR", "MCR")
    y
  })
  
  
  
  
  
  
  
  
  
  
  
  
  
  #############################
  output$table2 <- renderTable({
    table()
    
  },
  rownames = TRUE) 
  
}




#################################################################
shinyApp(ui, server)

但是,当我上次解决问题时,我的应用程序没有保存,所以我不得不尝试从内存中再次解决错误。我想我做得对。

这也是我的数据片段,

在此处输入图像描述

"beertax","jaild","vmiles","unrate","perinc","Rate" 1.53937947750092,"no",7.23388720703125,14.3999996185303,10544.15234375,1 1.78899073600769,"no",7.83634765625,13.6999998092651,10732.7978515625,1 1.71428561210632 ,"no",8.262990234375,11.1000003814697,11108.791015625,1 1.65254235267639,"no",8.7269169921875,8.89999961853027,11332.626953125,1 1.60990703105927,"no",8.952853515625,9.80000019073486,11661.5068359375,1 1.55999994277954,"no",9.1663017578125,7.80000019073486,11944, 1 1.50144362449646,"no",9.6743232421875,7.19999980926514,12368.6240234375,1 0.214797139167786,"yes",6.81015673828125,9.89999961853027,12309.0693359375,1 0.206422030925751,"yes",6.58749462890625,9.10000038146973,12693.8076171875,1 0.296703308820724,"是",6.70997021484375,5,13265.93359375,1

在此处输入图像描述

标签: rshinytreeclassificationsummary

解决方案


使用req()andas.numeric()应该可以解决前两个问题。在此之后,您应该能够修复最后一个。

编辑

更改select(input$variable)dplyr::select(input$variable)应该消除您的第一个错误。在您的软件包列表中,还有 4 个具有相同功能select()的软件包;因此,您需要指定您打算从哪个包中使用或dplyr最后加载它。

df1 <- read.table(text='"beertax","jaild","vmiles","unrate","perinc","Rate"
1.53937947750092,"no",7.23388720703125,14.3999996185303,10544.15234375,1
1.78899073600769,"no",7.83634765625,13.6999998092651,10732.7978515625,1
1.71428561210632,"no",8.262990234375,11.1000003814697,11108.791015625,1
1.65254235267639,"no",8.7269169921875,8.89999961853027,11332.626953125,1
1.60990703105927,"no",8.952853515625,9.80000019073486,11661.5068359375,1
1.55999994277954,"no",9.1663017578125,7.80000019073486,11944,1
1.50144362449646,"no",9.6743232421875,7.19999980926514,12368.6240234375,1
0.214797139167786,"yes",6.81015673828125,9.89999961853027,12309.0693359375,1
0.206422030925751,"yes",6.58749462890625,9.10000038146973,12693.8076171875,1
0.296703308820724,"yes",6.70997021484375,5,13265.93359375,1', header=TRUE, sep=",")

df2 <- df1 %>% transform(Rate=2)
data<- rbind(df1,df2)
data$Rate <- as.factor(data$Rate)

library(shiny)
library(dplyr)
library(ggplot2)
library(markdown)
library(gtsummary)
library(ggdendro)
library(factoextra)
library(mclust)
library(cluster)
library(rpart)
library(rpart.plot)
library(MASS)

ui <- fluidPage(
  navbarPage("Testing Data Exploration",
             tabPanel("Data Exploration",
                      sidebarLayout(
                        sidebarPanel(
                          selectInput("variable",
                                      "Variable",
                                      colnames(data)),
                          selectInput("rate",
                                      "Rate",
                                      levels(data$Rate))
                        ),
                        mainPanel(
                          DTOutput("table1"),
                          plotOutput("plot")
                        )
                      )
             ),
             tabPanel("Classification tools",
                      sidebarLayout(
                        sidebarPanel(
                          sliderInput("train.prop",
                                      "Training data proportion",
                                      min = 0.4,
                                      max = 0.8,
                                      step = 0.1,
                                      value = 0.6),
                          radioButtons("prune",
                                       "Pruning option",
                                       choices = c("view pruned tree",
                                                   "view unpruned tree"))
                        ),
                        mainPanel(
                          DTOutput("table2"),
                          plotOutput("plot2")

                        )
                      )
             )
  )
)


server <- function(input, output) {
  summ <- reactive({
    req(input$variable,input$rate)

    data1 <- data %>%
      filter(Rate == input$rate) %>%
      dplyr::select(input$variable) %>%
      summary() %>%
      as.data.frame() %>%
      tidyr::separate(Freq, c("Stat", "Value"), sep=":") %>%
      tidyr::pivot_wider(names_from = Stat, values_from = Value)
    data2 <- data1[, -c(1,2)]
    data2
  })

  output$table1 <- renderDT({
    summ()
  })

  output$plot <- renderPlot({
    req(input$variable)
    if (input$variable == "jaild" | input$variable == "Rate"){
      ggplot(data, aes(x = Rate, fill = .data[[as.name(input$variable)]])) +
        geom_bar(position = "dodge", width = 0.7) +
        if (input$variable == "Rate"){
          theme(legend.position = "none")
        }
    } else {
      ggplot(data, aes(x = Rate, y = .data[[as.name(input$variable)]], fill = Rate)) +
        geom_boxplot() +
        theme(legend.position = "none")
    }
  })

  output$plot2 <- renderPlot({
    req(input$train.prop,input$prune)
    set.seed(1234)
    n <- nrow(data)
    ind1 <- sample(c(1:n), round(n*as.numeric(input$train.prop)))
    ind2 <- sample(c(1:n)[-ind1], length(c(1:n)[-ind1]))


    train.data <- data[ind1,]
    valid.data <- data[ind2,]


    fit.tree <- rpart(Rate~., data = train.data, method = "class")

    ptree <- prune(fit.tree, cp = fit.tree$cptable[which.min(fit.tree$cptable[,"xerror"]),"CP"])

    if (input$prune == "view pruned tree"){
      rpart.plot(ptree, uniform =TRUE)
    } else {
      rpart.plot(fit.tree)
    }
  })
  #######################
  table <- reactive({

    set.seed(1234)
    n <- nrow(data)
    ind1 <- sample(c(1:n), round(n*input$train.prop))
    #ind2 <- sample(c(1:n)[-ind1], length(c(1:n)[-ind1]))
    ind2 <- setdiff(c(1:n), ind1)

    train.data <- data[ind1,]
    valid.data <- data[ind2,]

    #################################

    ### fit cart model
    fit.tree <- rpart(Rate~., data = train.data, method = "class")

    ### prune the tree
    ptree <- prune(fit.tree, cp = fit.tree$cptable[which.min(fit.tree$cptable[,"xerror"]),"CP"])

    ### predict using the validation data on the pruned tree
    pred <- predict(ptree, newdata = valid.data[,-6], type = "class")

    ### lda

    #lda.model <- lda(train.data[,-6], train.data[,6])

    lda.model <- lda(Rate~., data = train.data)


    lda.pred <- predict(lda.model, newdata = valid.data)

    ### create a classification table

    length(lda.model)


    x <- pred == valid.data[,6]

    CCR <- length(x[x == TRUE])/nrow(valid.data)
    MCR <- 1 - CCR

    CR <- c(CCR, MCR)

    z <- lda.pred$class == valid.data[,6]

    lda.CCR <- length(z[z == TRUE])/nrow(valid.data)
    lda.MCR <- 1 - CCR

    lda.CR <- c(lda.CCR, lda.MCR)

    y <- cbind(CR, lda.CR)

    y <- as.data.frame(y)
    colnames(y) <- c("CART", "LDA")
    rownames(y) <- c("CCR", "MCR")
    y
  })

  output$table2 <- renderDT({
    table()
  }, rownames = TRUE)

}

shinyApp(ui, server)

输出


推荐阅读