Invalid 'times' argument error when running inside a loop but not outside

Problem description

I can run this XML parsing process successfully outside of a loop:

library(xml2)

# find all items and store as a list
page <- read_xml("sample.xml")
items <- xml_find_all(page, "//d1:REPORT/d1:VEHICLEs/d1:ACRSVEHICLE/d1:PASSENGERs/d1:PASSENGER")

# extract all children's names and values
nodenames <- xml_name(xml_children(items))
contents <- trimws(xml_text(xml_children(items)))

# need to create an index to associate the nodes/contents with each item
itemindex <- rep(1:length(items), times = sapply(items, function(items) {length(xml_children(items))}))

# store all information in a data frame
df <- data.frame(itemindex, nodenames, contents)

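However, when I try to put it into a loop (there are many XML files in the directory, and not every column has a value), I get an error. This is the looped version: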

build_dfs <- function(xml_file) {
  # PARSE DOC AND ASSIGN XPATHs
  doc <- read_xml(xml_file)
  xpath_occupants <- "//d1:REPORT/d1:VEHICLEs/d1:ACRSVEHICLE/d1:PASSENGERs/d1:PASSENGER"

  # PEOPLE TABLE
  items <- xml_find_all(doc, xpath_occupants, xml_ns(doc))

  # extract all children's names and values
  nodenames <- xml_name(xml_children(items))
  contents <- trimws(xml_text(xml_children(items)))

  # need to create an index to associate the nodes/contents with each item
  itemindex <- rep(1:length(items), times = sapply(items, function(items) {length(xml_children(items))}))

  # store all information in a data frame
  occ <- data.frame(itemindex, nodenames, contents)

  # RETURN A NAMED LIST OF DATA FRAMES
  list(occ = occ)
}

 

# RETRIEVE XML FILES (full.names = TRUE so read_xml() receives complete paths)
xml_files <- list.files(file.path("U:", "tmp", "tst"), pattern = "[.]xml", full.names = TRUE)

# PASS EACH XML FILE INTO METHOD
data_list <- lapply(xml_files, build_dfs)

I get:

Error in rep(1:length(items), times = sapply(items, function(items) { :

  invalid 'times' argument

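I can't easily provide the XML files, but what is the potential cause of this error? I think it is related to the itemindex<-rep(1:length(items), times=sapply(items, function(items) {length(xml_children(items))})) command. How can I create an index that associates the node names and contents with each item in a way that works inside the loop?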

Tags: r, xml, list, loops, sapply

Solution
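
A likely cause, assuming at least one file in the directory contains no PASSENGER nodes: xml_find_all() then returns an empty node set, 1:length(items) evaluates to 1:0 (that is, c(1, 0)), and sapply() over empty input returns list(). rep() therefore receives an x of length 2 and a times argument of length 0, which it rejects. The base R sketch below reproduces that situation with plain lists standing in for node sets; build_index is a hypothetical helper name, not part of xml2.

```r
# Plain lists stand in for xml2 node sets (an assumption made so the
# sketch runs without any XML files).
empty_items <- list()
two_items <- list(c("NAME", "AGE"), c("NAME"))

# The original pattern: 1:0 is c(1, 0) and sapply() returns list() on
# empty input, so rep() raises an error.
original_fails <- tryCatch({
  rep(1:length(empty_items), times = sapply(empty_items, length))
  FALSE
}, error = function(e) TRUE)

# Hypothetical helper: seq_along() and vapply() stay well-typed on empty input.
build_index <- function(items) {
  rep(seq_along(items), times = vapply(items, length, integer(1)))
}

empty_index <- build_index(empty_items)  # zero-length integer vector, no error
full_index <- build_index(two_items)     # one index value per child
```

Inside build_dfs, the same idea would replace the itemindex line, using length(xml_children(item)) as the per-item count. Skipping a file when length(items) == 0 is another option if empty data frames are not wanted in the result.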

