R: How to extract a column from multiple CSV files and write multiple CSV files to a folder

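Question

I have a folder (folder 1) containing several CSV files: "x.csv", "y.csv", "z.csv"... I want to extract the 3rd column of each file and write the result as a new CSV file in a new folder (folder 2). Folder 2 should therefore contain "x.csv", "y.csv", "z.csv"... (but with only the 3rd column).

I tried this: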

dfiles <- list.files(pattern =".csv") #if you want to read all the files in working directory
lst2 <- lapply(dfiles, function(x) (read.csv(x, header=FALSE)[,3]))

But I got this error:

 Error in `[.data.frame`(read.csv(x, header = FALSE), , 3) : 
  undefined columns selected 

Also, I don't know how to write multiple CSV files.

However, if I do this with a single file it works fine, although the output ends up in the same folder:

essai <-read.csv("x.csv", header = FALSE, sep = ",")[,3]
write.csv (essai, file = "x.csv")

Any help would be appreciated.

Tags: r, loops, csv

Answer


Here's how I would do it. There may be a nicer or more efficient way, but this should work well.

setwd("~/stackexchange") # set your main folder; the here package (here::here()) is a better way to do this, but that's another topic

folder1 <- "folder1" # your original folder
folder2 <- "folder2" # your new folder
dir.create(folder2, showWarnings = FALSE) # make sure the output folder exists

# Read one file from folder1, keep only column 3, and write it to folder2 under the same name.
write_to <- function(file.name) {
  file.name <- paste0(tools::file_path_sans_ext(basename(file.name)), ".csv")
  essai <- read.csv(file.path(folder1, file.name), header = FALSE, sep = ",")[, 3]
  # row.names = FALSE keeps the row-number column out of the output
  write.csv(essai, file = file.path(folder2, file.name), row.names = FALSE)
}

# get the names of all csv files in folder1 (pattern is a regular expression, not a glob)
dfiles <- list.files(path = folder1, pattern = "\\.csv$")

# write_to is called for its side effect, so the returned list is not printed
invisible(lapply(X = file.path(folder1, dfiles), write_to))
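
As for the "undefined columns selected" error in the question: it is raised when at least one of the files that gets read has fewer than three columns, which can happen if the pattern also matches files you did not intend to read. Below is a minimal sketch of one way to guard against that. The helper name extract_third and the temporary demo folders are made up for illustration and are not part of the original code.

```r
# Hypothetical demo data in temporary folders, so the sketch runs on its own.
in_dir <- file.path(tempdir(), "folder1")
out_dir <- file.path(tempdir(), "folder2")
dir.create(in_dir, showWarnings = FALSE)
dir.create(out_dir, showWarnings = FALSE)
writeLines(c("1,2,3", "4,5,6"), file.path(in_dir, "x.csv"))
writeLines(c("1,2", "3,4"), file.path(in_dir, "y.csv"))  # only two columns

# Hypothetical helper: write column 3 of one file into out_dir, or skip the
# file with a warning when it has fewer than three columns.
extract_third <- function(path, out_dir) {
  df <- read.csv(path, header = FALSE)
  if (ncol(df) < 3) {
    warning("skipping ", basename(path), ": only ", ncol(df), " column(s)")
    return(FALSE)
  }
  write.csv(df[, 3, drop = FALSE], file.path(out_dir, basename(path)),
            row.names = FALSE)
  TRUE
}

files <- list.files(in_dir, pattern = "\\.csv$", full.names = TRUE)
# One TRUE/FALSE per file: TRUE if it was written, FALSE if it was skipped.
ok <- vapply(files, extract_third, logical(1), out_dir = out_dir)
```

Looking at `ok` afterwards tells you which files were skipped, so you can open those and see why they don't have a third column (a different separator is a common culprit).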

Have fun! Btw: if you have many files, you could use data.table::fread and data.table::fwrite, which read and write csv files much faster than read.csv and write.csv.
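
A rough sketch of what the data.table variant could look like, assuming the data.table package is installed. The temporary folders and demo files are made up so the snippet runs on its own; swap in your own folder1 and folder2.

```r
library(data.table)  # assumes the data.table package is installed

# Hypothetical demo data in temporary folders, so the sketch runs on its own.
in_dir <- file.path(tempdir(), "dt_in")
out_dir <- file.path(tempdir(), "dt_out")
dir.create(in_dir, showWarnings = FALSE)
dir.create(out_dir, showWarnings = FALSE)
writeLines(c("1,2,3", "4,5,6"), file.path(in_dir, "x.csv"))
writeLines(c("a,b,c", "d,e,f"), file.path(in_dir, "y.csv"))

files <- list.files(in_dir, pattern = "\\.csv$", full.names = TRUE)
for (f in files) {
  # select = 3 asks fread to load only the third column
  col3 <- fread(f, header = FALSE, select = 3)
  # col.names = FALSE writes the values without a header line
  fwrite(col3, file.path(out_dir, basename(f)), col.names = FALSE)
}
```

Unlike write.csv, fwrite does not write row names, and with col.names = FALSE the output files contain nothing but the values of column 3.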

