Reading multiple csv files (skipping 2 columns in each csv file) into one data frame in R?

Question

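I have a folder containing about 100 csv files that I want to read into a single data frame in R. I more or less know how to do that, but I have to skip the first two columns of each csv file, and that is the part where I am stuck. My code so far is: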

library(plyr) # provides ldply()
myfiles <- list.files(pattern = ".csv") # create a list of all csv files in the directory
data_csv <- ldply(myfiles, read.csv)

Thanks for any help.

Tags: r

Answer


Using the data.table package functions fread() and rbindlist() will generally get you the result you are after faster than the base R or tidyverse alternatives.

library(data.table)

## Create a list of the files
## (pattern is a regular expression, so escape the dot and anchor the end)
FileList <- list.files(pattern = "\\.csv$")

## Pre-allocate a list to store all of the results of reading
## so that we aren't re-copying the list for each iteration
DTList <- vector(mode = "list", length = length(FileList))

## Read in all the files, excluding the first two columns
for (i in seq_along(DTList)) {
  DTList[[i]] <- data.table::fread(FileList[[i]], drop = c(1,2))
}

## Combine the results into a single data.table
DT <- data.table::rbindlist(DTList)

## Optionally, convert the data.table to a data.frame to match requested result
## Though I would recommend looking into using data.table instead!
data.table::setDF(DT)
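
For comparison, here is a minimal base R sketch of the same idea, in case adding a package is not an option. It is an illustration rather than part of the answer above: the function name read_csvs_drop_first_two is hypothetical, and it assumes that every file has a header row, at least three columns, and the same column names after the first two are dropped.

```r
## Hypothetical helper: read every csv file in `dir`, drop the first two
## columns of each, and stack the results into one data.frame.
read_csvs_drop_first_two <- function(dir = ".") {
  ## pattern is a regular expression: escape the dot and anchor the end
  csv_files <- list.files(dir, pattern = "\\.csv$", full.names = TRUE)

  ## Read each file, then remove columns 1 and 2 by negative indexing;
  ## drop = FALSE keeps a data.frame even if only one column remains
  frames <- lapply(csv_files, function(path) {
    read.csv(path)[, -(1:2), drop = FALSE]
  })

  ## Row-bind all frames; assumes the remaining columns match across files
  do.call(rbind, frames)
}
```

Calling read_csvs_drop_first_two() with no argument reads from the current working directory. The same anonymous function could also be passed to ldply() in place of read.csv in the code from the question.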
