Splitting a large CSV by row into separate .txt files in R, with a header in each .txt

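Problem description

I am trying to create a separate .txt file for each row of a .csv in RStudio. I found the csv2txt function, but I can't work out how to edit it so that each .txt keeps the header information.

With the following code: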

csv2txt <- function(mydir, labels = 1){
  mycsvfile <- list.files(mydir, full.names = TRUE, pattern = "*.CSV|.csv")
  mycsvdata <- read.csv(mycsvfile)
  mytxtsconcat <- apply(mycsvdata[-(1:labels)], 1, paste, collapse=" ")
  mytxtsdf <- data.frame(filename = mycsvdata[,labels], # get the first col for the text file names
                         fulltext = mytxtsconcat) 
  setwd(mydir)
  invisible(lapply(1:nrow(mytxtsdf), function(i) write.table(mytxtsdf[i,2], 
                                                             file = paste0(mytxtsdf[i,1], ".txt"),
                                                             row.names = FALSE, col.names = FALSE,
                                                             quote = FALSE)))
  message(paste0("Your text files can be found in ", getwd()))
}

I get output that looks like this:

HILTON - ABERGIS CAYE AMBERGIS CAYE, BELIZE  0.47 0.35 0.31 0.82 0.74 0.52 0.69 0.88 0.71 0.88 0.68

The .csv has this header at the top:

Hotel   Area    Overall Satisfaction for Location   Overall Property Satisfaction   Property Appearance Add'tl Item Working Order   Property Maintenance    Staff Knowledge Staff Interaction   Safety/Security Check In/Out    Invoice Accuracy    Bed Quality

which I would like to have in every .txt.

Does anyone know how I could edit the code to do this, or know of a function that does it?

Thanks!

Tags: r, csv, split

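Solution

In case anyone else runs into a similar problem, this is what I ended up doing.

I forced the header into col.names, which isn't ideal because the header and the values don't line up exactly in the .txt. My workaround was to put | between the elements, so I can open the .txt in Excel, split it on that custom delimiter, and end up with correctly aligned columns.

The code is below.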

# Split the .csv into one .txt file per row
csv2txt <- function(mydir, labels = 1){
  
  # Find the CSV file in mydir (read.csv below expects exactly one match);
  # the pattern matches a ".csv" extension in either case
  mycsvfile <- list.files(mydir, full.names = TRUE, pattern = "\\.csv$", ignore.case = TRUE)
  
  # Read the contents of the CSV file into a data frame
  mycsvdata <- read.csv(mycsvfile)
  
  # Combine all columns except the label column(s) into
  # one long " | "-separated character string for each row
  mytxtsconcat <- apply(mycsvdata[-(1:labels)], 1, paste, collapse=" | ")
  
  # Make a data frame with the file names and texts
  mytxtsdf <- data.frame(filename = mycsvdata[,labels], # the label column supplies the text file names
                         fulltext = mytxtsconcat)
  
  # Now write one text file for each row of the csv;
  # use 'invisible' so nothing is printed to the console.
  # Note that setwd() changes the working directory for the rest of the session.
  setwd(mydir)
  invisible(lapply(seq_len(nrow(mytxtsdf)), function(i) write.table(mytxtsdf[i,2], 
                                                             file = paste0(mytxtsdf[i,1], ".txt"),
                                                             row.names = FALSE, col.names = " HOTEL (Q15 1) | METRO AREA STATE (Q10 1)  | Overall Location Satisfaction | Overall Property Satisfaction | Property Appearance | Add'tl Item Working Order | Property Maintenance | Staff Knowledge | Staff Interaction | Safety/Security | Check In/Out | Invoice Accuracy | Bed Quality",
                                                             quote = FALSE)))
  
  # Now check your folder to see the txt files
  message(paste0("Your text files can be found in ", getwd()))
}
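
The hard-coded col.names string above has to be retyped whenever the CSV changes. A possible alternative, shown here only as a minimal sketch, is to build the header line from the CSV's own column names and write it above each row. The function name split_csv_rows and its arguments (csv_path, out_dir, name_col, sep) are hypothetical and not part of the original post. Unlike csv2txt, this sketch writes every column, including the one used for the file name, so the header and the values have the same number of fields.

```r
# Sketch: write one .txt per CSV row, each starting with the CSV's own header.
# split_csv_rows and its argument names are hypothetical, chosen for illustration.
split_csv_rows <- function(csv_path, out_dir, name_col = 1, sep = " | ") {
  # check.names = FALSE keeps headers such as "Add'tl Item Working Order" as written
  dat <- read.csv(csv_path, check.names = FALSE, stringsAsFactors = FALSE)
  dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)

  # Build the header line once from the column names
  header_line <- paste(names(dat), collapse = sep)

  out_files <- character(nrow(dat))
  for (i in seq_len(nrow(dat))) {
    # Replace characters that are unsafe in file names (for example "/" or ":")
    stem <- gsub("[^A-Za-z0-9 ._-]", "_", as.character(dat[[name_col]][i]))
    out_files[i] <- file.path(out_dir, paste0(stem, ".txt"))

    # Convert every cell in row i to text and join with the same separator
    cells <- vapply(dat[i, , drop = FALSE], as.character, character(1))
    writeLines(c(header_line, paste(cells, collapse = sep)), out_files[i])
  }

  # Return the paths of the written files without printing them
  invisible(out_files)
}
```

Because the paths are passed explicitly, the sketch does not call setwd(). Rows that share the same value in name_col would overwrite each other's file, so that column is assumed to be unique.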
