Reading every Excel file in a folder, with selected sheets, in R

Question

I have a very simple piece of code:

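library(readxl)
library(dplyr)

# Read the sheet without treating the first row as column names
data <- read_excel("02.01.xlsx", col_names = FALSE)

data <- data %>%
   # take the value in row 1, column 3 as a scalar (data[1,3] would be a 1x1 tibble)
   mutate(direction = data[[3]][1]) %>% 
   tidyr::fill(1) %>% 
   slice(-1:-2) %>% 
   janitor::row_to_names(row_number = 1) %>% 
   purrr::set_names(c("date", "time", "max_price", "max_power", "nominal_power", "direction"))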

But I need to apply it to every Excel file in my folder, and only to certain sheets (1, 2, 3, 6, 7, 8, 9, 11).

I found this code:

dir_path = "~/Documents/Dixi/Jan/"
re_file <- list.files(path = path, pattern = "*.xls")

read_sheets <- function(dir_path, file){
   xlsx_file <- paste0(dir_path, file)
   xlsx_file %>%
      excel_sheets() %>%
      set_names() %>%
      map_df(read_excel, path = xlsx_file, .id = 'sheet_name') %>% 
      mutate(file_name = file) %>% 
      select(file_name, sheet_name, everything())
}

df <- list.files(dir_path, re_file) %>% 
   map_df(~ read_sheets(dir_path, .))

But how do I combine the two? I'm new to purrr and this is hard for me. Thanks!

Tags: r, excel, dplyr, purrr

Answer


If read_excel's default arguments are enough, you can do something simple:

library(purrr)

dir_path <- "~/Documents/Dixi/Jan/"
# pattern is a regular expression; full.names = TRUE returns each file name
# with the directory already prepended
re_file <- list.files(path = dir_path, pattern = "\\.xlsx?$", full.names = TRUE)

# readxl::read_excel reads the first sheet of each file;
# map_df row-binds the results into one data frame
map_df(re_file, readxl::read_excel)
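
The `pattern` argument of `list.files()` is a regular expression rather than a shell glob. Here is a minimal sketch, using made-up file names (hypothetical, for illustration only), of how an anchored pattern such as `\\.xlsx?$` selects only names ending in `.xls` or `.xlsx`, and how the temporary `~$` lock files Excel creates while a workbook is open can be filtered out:

```r
# Hypothetical file names, used only to illustrate the matching rules.
files <- c("02.01.xlsx", "03.01.xls", "notes_xls.txt",
           "~$02.01.xlsx", "report.xlsx.bak")

# TRUE for names that end in ".xls" or ".xlsx"
is_excel <- grepl("\\.xlsx?$", files)

# TRUE for Excel's temporary lock files, which start with "~$"
is_lock <- startsWith(files, "~$")

# keeps 02.01.xlsx and 03.01.xls
excel_files <- files[is_excel & !is_lock]
print(excel_files)
```

The same pattern can be passed directly as `pattern = "\\.xlsx?$"` to `list.files()`.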

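However, since you know your data better and have evidently already built a function to handle the read_excel arguments, this should make your function work: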

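library(readxl)
library(dplyr)   # mutate(), select(), everything() and %>%
library(purrr)

dir_path <- "~/Documents/Dixi/Jan/"
# pattern is a regular expression, so match the extension explicitly
re_file <- list.files(path = dir_path, pattern = "\\.xlsx?$")

# Read every sheet of one workbook and stack them,
# tagging each row with its file name and sheet name
read_sheets <- function(dir_path, file){
  # dir_path must end with "/" for paste0 to build a valid path
  xlsx_file <- paste0(dir_path, file)
  xlsx_file %>%
    excel_sheets() %>%
    set_names() %>%
    map_df(read_excel, path = xlsx_file, .id = 'sheet_name') %>% 
    mutate(file_name = file) %>% 
    select(file_name, sheet_name, everything())
}

# Apply read_sheets to every file and row-bind the results
re_file %>%
  map_df(function(x) read_sheets(dir_path = dir_path, file = x))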
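
Neither snippet restricts reading to the sheets listed in the question or applies the question's cleaning steps. Below is a minimal sketch of one way to combine the two. It assumes that the sheet numbers in the question are sheet positions, that every selected sheet has the same five-column layout as `02.01.xlsx`, and that the readxl, dplyr, purrr, tidyr and janitor packages are installed. The names `clean_sheet`, `read_folder`, `file_name` and `sheet_index` are hypothetical and introduced here only for illustration.

```r
library(readxl)
library(dplyr)
library(purrr)

# Hypothetical helper: the question's cleaning pipeline for one sheet of one file.
clean_sheet <- function(path, sheet) {
  raw <- read_excel(path, sheet = sheet, col_names = FALSE)
  raw %>%
    mutate(direction = raw[[3]][1]) %>%      # value in row 1, column 3
    tidyr::fill(1) %>%
    slice(-1:-2) %>%
    janitor::row_to_names(row_number = 1) %>%
    set_names(c("date", "time", "max_price", "max_power",
                "nominal_power", "direction")) %>%
    # record where each row came from
    mutate(file_name = basename(path), sheet_index = sheet, .before = 1)
}

# Hypothetical helper: apply clean_sheet to every (file, sheet) pair in a folder.
read_folder <- function(dir_path, sheets) {
  files <- list.files(dir_path, pattern = "\\.xlsx?$", full.names = TRUE)
  # one row per combination of file and wanted sheet
  jobs <- tidyr::expand_grid(path = files, sheet = sheets)
  # call clean_sheet on each pair and row-bind the results
  map2_dfr(jobs$path, jobs$sheet, clean_sheet)
}

# Usage with the folder and sheet positions from the question:
# df <- read_folder("~/Documents/Dixi/Jan/", c(1, 2, 3, 6, 7, 8, 9, 11))
```

If some workbook has fewer sheets than the highest position requested, `read_excel` will stop with an error, so the list of positions may need to be adjusted per file. Row-binding also expects matching column types across sheets; if they differ, converting the columns inside `clean_sheet` before binding is one option.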
