Tidying data with a loop across multiple data.frames

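Question

I am a beginner trying to understand R for the first time, and while learning it I have run into a couple of problems I cannot find answers to.

My questions are:

  1. How do I create a loop that filters out certain parts of my data?
  2. How do I import a large number of .CSV files from a folder and then apply a loop that drops the extra columns and filters the rows, including setting the variable name to the file name?

This is essentially the filter I want, but I need to apply it to a series of .CSV files and write the results to new CSV files.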

library(dplyr)  # provides filter(), used below

rainfallg1 <- read.csv("120015A.csv",
                       stringsAsFactors = FALSE,
                       sep = ",")
rainfall_filter <- rainfallg1[, 1:3]

# This section names the columns and numerically codes them making it easy to filter.

names(rainfall_filter)[1] <- "Time_Date"
names(rainfall_filter)[2] <- "Rainfall"
names(rainfall_filter)[3] <- "Code_of_Standard"

rainfall_filter$Rainfall <- as.integer(rainfall_filter$Rainfall)
rainfall_filter$Code_of_Standard <- as.integer(rainfall_filter$Code_of_Standard)

rainfall_filter_1 <- filter(rainfall_filter, Code_of_Standard <= 83) 

Tags: r

Answer


You can get a list of all the file names with list.files and then apply the cleaning steps to each file with lapply or map.

library(dplyr)
filenames <- list.files(pattern = '\\.csv$', full.names = TRUE)

purrr::map(filenames, ~.x %>% 
                #Read the data
                read.csv(stringsAsFactors=FALSE) %>%
                #Select only first 3 columns
                select(1:3) %>%
                #Rename the columns
                setNames(c('Time_Date', 'Rainfall', 'Code_of_Standard')) %>%
                #Change `Rainfall` and  `Code_of_Standard` columns to integer
                mutate(across(c(Rainfall, Code_of_Standard), as.integer)) %>%
                #keep only rows less than equal to 83 in Code_of_Standard
                filter(Code_of_Standard <= 83) %>%
                #Write the csv file.
                write.csv(paste0(tools::file_path_sans_ext(basename(.x)), 
                          '_new.csv'), row.names = FALSE)
      )

This should write the new files to your working directory. If your old files are called df1.csv and df2.csv, this will write df1_new.csv and df2_new.csv.
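
For comparison, the same steps can be sketched in base R with lapply, which avoids the dplyr and purrr dependencies. This is only a minimal sketch under a few assumptions: the helper name clean_rainfall_file and its max_code argument are made up for illustration, the output is written next to the source file (which is the working directory when list.files is called without a path), and naming the resulting list after the files is one possible reading of "setting the variable name to the file name".

```r
# Hypothetical helper (not from the original answer): clean one rainfall CSV,
# write "<name>_new.csv" next to it, and return the cleaned data frame.
clean_rainfall_file <- function(path, max_code = 83) {
  raw <- read.csv(path, stringsAsFactors = FALSE)

  # Keep only the first three columns and give them readable names
  out <- raw[, 1:3]
  names(out) <- c("Time_Date", "Rainfall", "Code_of_Standard")

  # Convert both measurement columns to integer
  out$Rainfall <- as.integer(out$Rainfall)
  out$Code_of_Standard <- as.integer(out$Code_of_Standard)

  # which() drops rows whose code is NA, matching what dplyr::filter() does
  out <- out[which(out$Code_of_Standard <= max_code), ]

  new_path <- file.path(
    dirname(path),
    paste0(tools::file_path_sans_ext(basename(path)), "_new.csv")
  )
  write.csv(out, new_path, row.names = FALSE)
  invisible(out)
}

filenames <- list.files(pattern = "\\.csv$", full.names = TRUE)
# Skip outputs from an earlier run so they are not processed again
filenames <- filenames[!grepl("_new\\.csv$", filenames)]

# One cleaned data frame per input file, named after the file (without extension)
cleaned <- lapply(filenames, clean_rainfall_file)
names(cleaned) <- tools::file_path_sans_ext(basename(filenames))
```

With inputs df1.csv and df2.csv, the cleaned data frames would then be available as cleaned$df1 and cleaned$df2, in addition to the files written to disk. If the files end in .CSV rather than .csv, adding ignore.case = TRUE to the list.files call should pick them up.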

