How to loop through multiple row-index ranges to create a separate data frame for each range - R

Problem description

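I am basically taking a non-normalized dataset and transforming it into one that can be loaded into a SQL Server table. Using the sample code below, is there a more efficient way to do this without explicitly listing the row indices of "ASRN1" or the data frames I want to spread, merge, and bind? I have hundreds of datasets to loop through; some may have 3 sets of ASRN1, SERVICE, and OCR, while others may have 30.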

library(tidyr)  # provides spread()

Columns <- c("SERIVCE ORDER", "SERVICE ORDER DATE", "ASRN1", "SERVICE", "OCR", "ASRN1", "SERVICE", "OCR", "ASRN1", "SERVICE", "OCR", "COMMENTS")
Values <- c("peanuts", "06/09/2020", "1111", "abcd", "xxxx", "2222", "efgh", "yyyy", "3333", "ijkl", "zzzz", "zippitydoda")

df <- data.frame(Columns, Values)

# Row indices at which the first, second, and third ASRN1 groups start
a <- which(df$Columns == "ASRN1")[1]
b <- which(df$Columns == "ASRN1")[2]
c <- which(df$Columns == "ASRN1")[3]

# Header rows before the first group, then one wide data frame per group
dfa <- spread(unique(df[1:(a - 1), ]), Columns, Values)
dfb <- spread(df[a:(b - 1), ], Columns, Values)
dfc <- spread(df[b:(c - 1), ], Columns, Values)
dfe <- spread(tail(df, -c + 1), Columns, Values)

# Attach the header columns to each group
dff <- merge(dfa, dfb)
dfg <- merge(dfa, dfc)
dfh <- merge(dfa, dfe)

# Stack the groups into one data frame
dfj <- dplyr::bind_rows(dff, dfg, dfh)

Tags: r, loops

Solution


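Consider using by to subset the data frame by the Columns column, then build a list of vectors to pass to cbind at the end. This assumes that the repeated values all repeat the same number of times and that every other value appears exactly once; the single values are then recycled across the repeated rows.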
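Before relying on that assumption, it can be checked for a given dataset. The following is a minimal sketch using the question's sample names; `counts`, `repeated_counts`, and `assumption_holds` are illustrative variable names, not part of the original answer.

```r
# Sample column names from the question (spelling kept as given)
Columns <- c("SERIVCE ORDER", "SERVICE ORDER DATE", "ASRN1", "SERVICE", "OCR",
             "ASRN1", "SERVICE", "OCR", "ASRN1", "SERVICE", "OCR", "COMMENTS")

# How many times each name occurs
counts <- table(Columns)

# Keep only the names that occur more than once
repeated_counts <- counts[counts > 1]

# TRUE when every repeated name occurs the same number of times
assumption_holds <- length(unique(as.vector(repeated_counts))) <= 1
```

If `assumption_holds` is FALSE, at least one group is missing a field, and the cbind step should not be expected to line values up correctly.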

# BUILD LIST OF VECTORS
vec_list <- by(df, df$Columns, function(sub) {
    # RENAME COLUMNS
    tmp <- setNames(sub, c("Columns", as.character(sub$Columns[1])))
    # REMOVE FIRST COLUMN
    tmp <- transform(tmp, Columns = NULL)
})

# CBIND ALL DF ELEMENTS
final_df <- do.call(cbind.data.frame, vec_list)
final_df
#   ASRN1    COMMENTS  OCR SERIVCE ORDER SERVICE SERVICE ORDER DATE
# 1  1111 zippitydoda xxxx       peanuts    abcd         06/09/2020
# 2  2222 zippitydoda yyyy       peanuts    efgh         06/09/2020
# 3  3333 zippitydoda zzzz       peanuts    ijkl         06/09/2020
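
Since the question asks about looping over row-index ranges for hundreds of datasets, the same idea can also be wrapped in a function. The following is only a sketch in base R under the same assumptions as above (every block starts with the marker name and contains the same fields); `reshape_one`, its `marker` argument, and the `datasets` list are hypothetical names introduced here, not part of the original answer.

```r
# Sample data from the question (the "SERIVCE ORDER" spelling is kept as given)
Columns <- c("SERIVCE ORDER", "SERVICE ORDER DATE", "ASRN1", "SERVICE", "OCR",
             "ASRN1", "SERVICE", "OCR", "ASRN1", "SERVICE", "OCR", "COMMENTS")
Values <- c("peanuts", "06/09/2020", "1111", "abcd", "xxxx", "2222", "efgh",
            "yyyy", "3333", "ijkl", "zzzz", "zippitydoda")
df <- data.frame(Columns, Values)

# Hypothetical helper: reshape one long Columns/Values data frame to wide form
reshape_one <- function(df, marker = "ASRN1") {
  cols <- as.character(df$Columns)
  vals <- as.character(df$Values)

  # Names occurring more than once belong to the repeating block
  repeated <- cols %in% cols[duplicated(cols)]

  # Every marker row opens a new block, so cumsum() numbers the blocks 1, 2, ...
  block <- cumsum(cols == marker)[repeated]

  # Turn each block's row-index range into a one-row data frame
  rows <- lapply(split(which(repeated), block), function(i) {
    data.frame(as.list(setNames(vals[i], cols[i])),
               check.names = FALSE, stringsAsFactors = FALSE)
  })
  wide <- do.call(rbind, rows)

  # Names occurring once are constants; a length-1 value is recycled per row
  once <- which(!repeated)
  for (i in once) wide[[cols[i]]] <- vals[i]

  # Constants first, then the repeating fields
  wide[c(cols[once], setdiff(names(wide), cols[once]))]
}

wide_df <- reshape_one(df)

# Assumed usage for many inputs: a list of data frames shaped like df
datasets <- list(df, df)
all_wide <- do.call(rbind, lapply(datasets, reshape_one))
```

Because the block boundaries come from `cumsum()` over the marker rows, the function does not need to know whether a dataset has 3 groups or 30. Unlike the question's code, it repeats COMMENTS on every row, matching the output shown above.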
