r - excel batch load with specific sheet
Problem description
I'm trying to batch load several xlsx files into a single data frame. So far I have this piece of code, which works well:
file.list <- list.files(path = "base-files/", pattern='*.xlsx', full.names = TRUE)
test1 <- sapply(file.list, read_xlsx, simplify=FALSE) %>% bind_rows(.id = "id")
My issue is with the number of sheets in each xlsx file. I only want to load sheet 2 from each file. Is there a way to pass a sheet argument through the sapply call?
EDIT: After running your recommended code, I get the error below. I guess I need to install Perl?
file.list <- list.files(path = "base-files/", pattern='*.xlsx', full.names = TRUE)
test1 = lapply(file.list, function(x) {
sheet_no <- if(sheetCount(x) == 1) 1 else 2
read_xlsx(x, sheet = sheet_no)
}) %>% bind_rows(.id = 'id')
Error in findPerl(verbose = verbose) :
perl executable not found. Use perl= argument to specify the correct path.
EDIT: I just installed Perl for Windows: https://www.activestate.com/products/perl/downloads/
Solution
We can specify the sheet number by passing sheet = 2 through sapply to read_xlsx:
sapply(file.list, read_xlsx, sheet = 2, simplify=FALSE) %>%
bind_rows(.id = "id")
According to ?read_xlsx:
sheet - Sheet to read. Either a string (the name of a sheet), or an integer (the position of the sheet). Ignored if the sheet is specified via range. If neither argument specifies the sheet, defaults to the first sheet.
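If some files contain only one sheet, the sheetCount function from gdata can be used to pick the sheet per file. Note that gdata relies on Perl to inspect Excel files, so sheetCount fails with the findPerl error when Perl is not installed: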
library(gdata)
lapply(file.list, function(x) {
sheet_no <- if(sheetCount(x) == 1) 1 else 2
read_xlsx(x, sheet = sheet_no)
}) %>%
bind_rows(.id = 'id')
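As an alternative that should avoid the Perl dependency, the sheet count can be taken from readxl itself with excel_sheets(), which returns the sheet names of a workbook. The following is a minimal sketch, not a definitive implementation: the helper name read_preferred_sheet and its preferred argument are hypothetical, and the regex "\\.xlsx$" is an assumed stricter replacement for the '*.xlsx' pattern used in the question. Naming the list with setNames keeps the file paths in the id column, as the original sapply call did.

```r
library(readxl)
library(dplyr)

# Hypothetical helper: read the preferred sheet when the workbook has
# that many sheets, otherwise fall back to the first sheet.
read_preferred_sheet <- function(path, preferred = 2) {
  n_sheets <- length(excel_sheets(path))  # excel_sheets() lists sheet names
  sheet_no <- if (n_sheets >= preferred) preferred else 1
  read_xlsx(path, sheet = sheet_no)
}

# Assumed directory from the question; the pattern is a regular expression.
file.list <- list.files(path = "base-files/", pattern = "\\.xlsx$",
                        full.names = TRUE)

# Name the list by file path so bind_rows() stores the path in "id".
test1 <- lapply(setNames(file.list, file.list), read_preferred_sheet) %>%
  bind_rows(.id = "id")
```

Because the helper only uses readxl, no Perl installation should be needed for this variant.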