mutate() returns an "object not found" error

Problem description

I'm trying to clean my data and add a new column named Volume using mutate().

Here is the data I read into R:

> df1 <- file.choose()
> data1 <- read_excel(df1)                                                                                                                                   
> head(data1)
# A tibble: 5 x 3
  `product id` amount `total sales`
  <chr>         <dbl>         <dbl>
1 X180             20           200
2 X109             30           300
3 X918             20           200
4 X273             15           150
5 X988             12           120

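Next, I subset the columns and rename `product id` and `total sales` to `Product Code` and `Net Sales`, respectively. Then I use mutate() to apply my own function to `Net Sales` and create a new Volume column.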

> data2 <- data1 %>% 
+   select(`Product Code` = `product id`, `Net Sales` = `total sales`) %>%
+   replace_na(list(`Net Sales` = 0))%>%
+   arrange(desc(`Net Sales`))%>%
+   mutate(Volume = rank_volume(data1, `Net Sales`))

Here is the error message I get:

 Error: Problem with `mutate()` column `Volume`.
ℹ `Volume = rank_volume(data1, `Net Sales`)`.
x arrange() failed at implicit mutate() step. 
* Problem with `mutate()` column `..1`.
ℹ `..1 = Net Sales`.
x object 'Net Sales' not found

Here is the rank_volume function I created:

### a function to label the products that are top one third in total sales as "H", products with the lowest third in sales as "L", and the rest as "M"
rank_volume <- function(data, column) {
  
  column <- ensym(column)
  colstr <- as_string(column)
  data <- arrange(data, desc(!!column))
  size <- length(data[[colstr]])
  first_third <- data[[colstr]][round(size / 3)]
  last_third <- data[[colstr]][round(size - (size / 3))]
  
  case_when(data[[colstr]] > first_third ~ "H",
            data[[colstr]] < last_third ~ "L",
            TRUE ~ "M")
}

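When I run my function on its own with a simple data frame, it works perfectly. When I run it inside mutate(), however, I get the error above. I can't find the problem. Can anyone help?

Edit: output of dput(head(data1)):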

> dput(head(data1))
structure(list(`product id` = c("X180", "X109", "X918", "X273", 
"X988"), amount = c(20, 30, 20, 15, 12), `total sales` = c(200, 
300, 200, 150, 120)), row.names = c(NA, -5L), class = c("tbl_df", 
"tbl", "data.frame"))

Tags: r, dplyr

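Solution

data1 has no `Net Sales` column; that column exists only in the transformed data flowing through the pipe. Because rank_volume() is handed data1, the arrange() call inside it cannot find `Net Sales`. You can use . to refer to the current data frame in the pipeline instead.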

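library(dplyr)
library(tidyr)  # for replace_na()
library(rlang)  # for ensym() and as_string(), used inside rank_volume()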

data1 %>% 
     select(`Product Code` = `product id`, `Net Sales` = `total sales`) %>%
     replace_na(list(`Net Sales` = 0))%>%
     arrange(desc(`Net Sales`)) %>%
     mutate(Volume = rank_volume(., `Net Sales`))

# `Product Code` `Net Sales` Volume
#  <chr>                <dbl> <chr> 
#1 X109                   300 H     
#2 X180                   200 M     
#3 X918                   200 M     
#4 X273                   150 L     
#5 X988                   120 L     
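
To make the cause of the error concrete, here is a minimal check that rebuilds the sample data from the dput() output in the question and compares column names before and after the select() step. The name `renamed` is only an illustrative variable and does not appear in the original code.

```r
library(dplyr)

# Sample data rebuilt from the dput() output in the question
data1 <- tibble(
  `product id`  = c("X180", "X109", "X918", "X273", "X988"),
  amount        = c(20, 30, 20, 15, 12),
  `total sales` = c(200, 300, 200, 150, 120)
)

# The original data frame has no `Net Sales` column
"Net Sales" %in% names(data1)
# FALSE

# The renamed column exists only in the data produced by select()
renamed <- data1 %>%
  select(`Product Code` = `product id`, `Net Sales` = `total sales`)
"Net Sales" %in% names(renamed)
# TRUE
```

Passing data1 to rank_volume() therefore asks arrange() to sort by a column that data1 never had, which matches the "object 'Net Sales' not found" message.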

Alternatively, you can use cur_data():

data1 %>% 
     select(`Product Code` = `product id`, `Net Sales` = `total sales`) %>%
     replace_na(list(`Net Sales` = 0))%>%
     arrange(desc(`Net Sales`)) %>%
     mutate(Volume = rank_volume(cur_data(), `Net Sales`))
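
In dplyr 1.1.0 and later, cur_data() is deprecated in favor of pick(). The sketch below shows the same pipeline with pick(everything()), assuming a dplyr version of at least 1.1.0. The data and rank_volume() are copied from the question so that the block runs on its own; `data2` is the result name used in the question.

```r
library(dplyr)  # assumes dplyr >= 1.1.0, which provides pick()
library(tidyr)  # replace_na()
library(rlang)  # ensym(), as_string()

# Sample data rebuilt from the dput() output in the question
data1 <- tibble(
  `product id`  = c("X180", "X109", "X918", "X273", "X988"),
  amount        = c(20, 30, 20, 15, 12),
  `total sales` = c(200, 300, 200, 150, 120)
)

# rank_volume() as defined in the question
rank_volume <- function(data, column) {
  column <- ensym(column)
  colstr <- as_string(column)
  data <- arrange(data, desc(!!column))
  size <- length(data[[colstr]])
  first_third <- data[[colstr]][round(size / 3)]
  last_third <- data[[colstr]][round(size - (size / 3))]

  case_when(data[[colstr]] > first_third ~ "H",
            data[[colstr]] < last_third ~ "L",
            TRUE ~ "M")
}

data2 <- data1 %>%
  select(`Product Code` = `product id`, `Net Sales` = `total sales`) %>%
  replace_na(list(`Net Sales` = 0)) %>%
  arrange(desc(`Net Sales`)) %>%
  # pick(everything()) returns the current data frame inside mutate()
  mutate(Volume = rank_volume(pick(everything()), `Net Sales`))

data2$Volume
# "H" "M" "M" "L" "L"
```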
