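python-3.x - Iterating over a variable-length XML file in Python 3
Problem description
I have a deeply nested XML file that I need to iterate over to extract records. I followed some examples for reading XML and assumed that every record had the same number of fields, but after a few extractions I found that this is not the case. Here is my code: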
import xml.etree.ElementTree as ET
tree = ET.parse('EcommProdotti.xml')
root = tree.getroot()
print("Printing on file...")
with open("prodotti.txt", "w") as f:
    for child in root:
        for element in child.iter('Products'):
            for sub_element in element.iter('Product'):
                length = len(sub_element) + 1
                my_string = sub_element[1].text + " " + sub_element[2].text + " " + sub_element[9].text + "\n"
                f.write(my_string)
As you can see, my records are in the sub_element nodes, whose number of children can vary, as the following sample of the XML file shows:
<?xml version="1.0" encoding="UTF-8"?>
<!-- File in formato Easyfatt-XML creato con Danea Easyfatt - www.danea.it/software/easyfatt -->
<!-- Per importare o creare un file in formato Easyfatt-Xml, consultare la documentazione tecnica: www.danea.it/software/easyfatt/xml -->
<EasyfattProducts AppVersion="2" Creator="Danea Easyfatt Enterprise One 2019.45d" CreatorUrl="http://www.danea.it/software/easyfatt" Mode="full" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="https://www.danea.it/public/prodotti.xsd">
<Products>
<Product>
<InternalID>35</InternalID>
<Code>00035</Code>
<Description>12 PEZZI ROTOLO SACCHETTO IGIENICO PER CANI</Description>
<DescriptionHtml></DescriptionHtml>
<Category>ACCESSORI PER ANIMALI</Category>
<Subcategory>PRODOTTI IGIENE E PULIZIA</Subcategory>
<Vat Perc="22" Class="Imponibile" Description="Imponibile 22%">22</Vat>
<Um>PZ</Um>
<NetPrice1>2.46</NetPrice1>
<GrossPrice1>3</GrossPrice1>
<Barcode>8805786177364</Barcode>
<SupplierCode>0004</SupplierCode>
<SupplierName>LED STORM SRLS</SupplierName>
<SupplierNetPrice>1.898</SupplierNetPrice>
<SupplierGrossPrice>2.3156</SupplierGrossPrice>
<SizeUm>cm</SizeUm>
<WeightUm>kg</WeightUm>
<GrossWeight>0.2</GrossWeight>
<ManageWarehouse>true</ManageWarehouse>
<OrderWaitDays>10</OrderWaitDays>
<AvailableQty>6</AvailableQty>
<Notes></Notes>
</Product>
<Product>
<InternalID>1155</InternalID>
<Code>01144</Code>
<Description>ADVANCE CANE ATOPIC MEDIUM/MAXI 3 KG</Description>
<DescriptionHtml></DescriptionHtml>
<Category>ALIMENTI PER ANIMALI</Category>
<Subcategory>ALIMENTI CURATIVI (DIETETICI)</Subcategory>
<Vat Perc="22" Class="Imponibile" Description="Imponibile 22%">22</Vat>
<Um>PZ</Um>
<NetPrice1>20.48</NetPrice1>
<GrossPrice1>24.99</GrossPrice1>
<Barcode>8410650170695</Barcode>
<ProducerName>Affinity</ProducerName>
<SupplierCode>0033</SupplierCode>
<SupplierName>LOCONTE VITO &amp; C. S.A.S.</SupplierName>
<SupplierProductCode>ADV924483</SupplierProductCode>
<SupplierNetPrice>13.0438</SupplierNetPrice>
<SupplierGrossPrice>15.9134</SupplierGrossPrice>
<SizeUm>cm</SizeUm>
<WeightUm>kg</WeightUm>
<GrossWeight>3</GrossWeight>
<ManageWarehouse>true</ManageWarehouse>
<AvailableQty>8</AvailableQty>
<Notes></Notes>
</Product>
<Product>
<InternalID>203</InternalID>
<Code>00198</Code>
<Description>ADVANTIX CANE FINO A 4 KG</Description>
<DescriptionHtml></DescriptionHtml>
<Category>ACCESSORI PER ANIMALI</Category>
<Subcategory>ANTIPARASSITARI</Subcategory>
<Vat Perc="10" Class="Imponibile" Description="Imponibile 10%">10</Vat>
<Um>PZ</Um>
<NetPrice1>19.82</NetPrice1>
<GrossPrice1>21.8</GrossPrice1>
<Barcode>4007221009597</Barcode>
<ProducerName>Bayer</ProducerName>
<SupplierCode>0033</SupplierCode>
<SupplierName>LOCONTE VITO &amp; C. S.A.S.</SupplierName>
<SupplierProductCode>BYR03382048</SupplierProductCode>
<SupplierNetPrice>16.25</SupplierNetPrice>
<SupplierGrossPrice>17.875</SupplierGrossPrice>
<SizeUm>cm</SizeUm>
<WeightUm>kg</WeightUm>
<GrossWeight>0.04</GrossWeight>
<ManageWarehouse>true</ManageWarehouse>
<OrderWaitDays>10</OrderWaitDays>
<AvailableQty>10</AvailableQty>
<Notes></Notes>
<ExtraBarcodes>
<Barcode>103629046</Barcode>
<Barcode>4007221046424</Barcode>
</ExtraBarcodes>
</Product>
</Products>
</EasyfattProducts>
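So, how can I iterate over this file? Thanks in advance for your time and help.
Solution
You can use XPath queries to look up each field by tag name instead of by position, which avoids direct index access: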
import xml.etree.ElementTree as ET
tree = ET.parse('EcommProdotti.xml')
root = tree.getroot()
# Select every <Product> under <Products>, wherever it sits in the tree
for product in root.findall(".//Products/Product"):
    for field in ['Code', 'Description', 'GrossPrice1', 'SupplierProductCode']:
        # find() returns None when the product has no such child element
        value = product.find(field)
        if value is not None:
            print(value.text, end=' ')
        else:
            print('Not defined', end=' ')
    print()
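The same lookup by tag name can also produce the text file the question asks for. The following is only a minimal sketch under a few assumptions: `SAMPLE` is a shortened, hypothetical excerpt of the question's file, and `product_lines`, `FIELDS` and the `"Not defined"` placeholder are names chosen here for illustration. It relies on `Element.findtext`, which returns a default when the child element is missing.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample, shortened from the file shown in the question.
SAMPLE = """<EasyfattProducts>
  <Products>
    <Product>
      <Code>00035</Code>
      <Description>12 PEZZI ROTOLO SACCHETTO IGIENICO PER CANI</Description>
      <GrossPrice1>3</GrossPrice1>
    </Product>
    <Product>
      <Code>00198</Code>
      <Description>ADVANTIX CANE FINO A 4 KG</Description>
      <GrossPrice1>21.8</GrossPrice1>
      <SupplierProductCode>BYR03382048</SupplierProductCode>
    </Product>
  </Products>
</EasyfattProducts>"""

# Fields to extract, in output order (illustrative choice).
FIELDS = ["Code", "Description", "GrossPrice1", "SupplierProductCode"]


def product_lines(root, fields=FIELDS, default="Not defined"):
    """Return one space-separated line per <Product>, looked up by tag name."""
    lines = []
    for product in root.findall(".//Products/Product"):
        # findtext() returns `default` when the child element is missing;
        # `or default` also covers elements that exist but are empty.
        values = [product.findtext(field, default=default) or default
                  for field in fields]
        lines.append(" ".join(values))
    return lines


if __name__ == "__main__":
    root = ET.fromstring(SAMPLE)
    # Write one line per product, as the question's code intended.
    with open("prodotti.txt", "w", encoding="utf-8") as f:
        for line in product_lines(root):
            f.write(line + "\n")
```

For the real file, replacing `ET.fromstring(SAMPLE)` with `ET.parse('EcommProdotti.xml').getroot()` should work the same way, since the lookup depends only on tag names and not on how many children each `Product` has.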