Getting the value of a child's child element in Python 2.6.6

Problem description

I have XML like the sample below and want to extract each alarm ID together with the name of its attachment.

<alarms formatVersion="1">
  <alarm id="4">
    <startDate>2018-06-19 08:10:05.0 UTC</startDate>
    <alarmDate>2018-06-19 08:10:05.0 UTC</alarmDate>
    <type>1234567</type>
    <intense>50</intense>
    <attachments>
      <attachment filename="20180619.partials.55.1234567.1.csv.gz" mimeType="text/csv"/>
    </attachments>
  </alarm>
  <alarm id="5">
    <startDate>2018-05-19 09:10:05.0 UTC</startDate>
    <alarmDate>2018-05-19 08:10:05.0 UTC</alarmDate>
    <type>1234567</type>
    <intense>50</intense>
    <attachments>
        <attachment filename="20180519.payers.12.1015500.1.csv.gz" mimeType="text/csv"/>
    </attachments>
  </alarm>
  <alarm id="5">
    <startDate>2018-05-19 09:10:05.0 UTC</startDate>
    <alarmDate>2018-05-19 08:10:05.0 UTC</alarmDate>
    <type>1234567</type>
    <intense>50</intense>
  </alarm>
</alarms>

Code attempt:

import xml.etree.ElementTree as ET
import gzip

input=gzip.open('input-xml.gz','r')
tree=ET.parse(input)
root=tree.getroot()


for lsofals in root.findall("./alarm/"):
    print lsofals.attrib
    for atts in lsofals.findall('attachments'):
        print atts.getchildren()
        for aname in atts.findall('attachment filename'):
            print aname.attrib

Desired sample output:

{4: 20180619.partials.55.1234567.1.csv.gz, 5:20180519.payers.12.1015500.1.csv.gz}

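With the current code I can get the alarm attributes, but not the attachment values. I am new to Python and stuck here. Once I can retrieve the attachment values I need to build a dictionary from them, which I will work on after this problem is solved.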

Tags: python, elementtree

Solution


Use a simple element path:

import xml.etree.ElementTree as ET
import gzip

input = gzip.open('input-xml.gz','r')
tree = ET.parse(input)
root = tree.getroot()

for att in root.findall("./alarm/attachments/attachment"):
    print(att.get('filename'))

Output:

20180619.partials.55.1234567.1.csv.gz
20180519.payers.12.1015500.1.csv.gz
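
For comparison with the attempt in the question: `filename` is an attribute of `<attachment>`, not a tag, so it cannot be part of the path passed to `findall`. The path should name only tags, and the attribute is then read from the matched element with `get`. The following is a minimal sketch of the question's nested loop corrected along those lines. `XML_SAMPLE` and `pairs` are names made up for this illustration, and the XML is a trimmed inline copy of the sample so the code runs without the `.gz` file.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the question's XML, inlined for illustration (hypothetical name).
XML_SAMPLE = """<alarms formatVersion="1">
  <alarm id="4">
    <attachments>
      <attachment filename="20180619.partials.55.1234567.1.csv.gz" mimeType="text/csv"/>
    </attachments>
  </alarm>
  <alarm id="5">
    <attachments>
      <attachment filename="20180519.payers.12.1015500.1.csv.gz" mimeType="text/csv"/>
    </attachments>
  </alarm>
  <alarm id="5"/>
</alarms>"""

root = ET.fromstring(XML_SAMPLE)

pairs = []
for alarm in root.findall("./alarm"):
    # The path contains tag names only: attachments, then attachment.
    for att in alarm.findall("attachments/attachment"):
        # Attributes are read from the matched element, not from the path.
        pairs.append((alarm.get("id"), att.get("filename")))

for alarm_id, filename in pairs:
    print("%s %s" % (alarm_id, filename))
```

The last alarm has no `<attachments>` child, so the inner `findall` returns an empty list for it and it contributes nothing to `pairs`.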

If you need the result as a dictionary:

...
d = {}
for alarm in root.findall("./alarm"):
    for att in alarm.findall("attachments/attachment"):
        d[alarm.get('id')] = att.get('filename')

print(d)

Output:

{'4': '20180619.partials.55.1234567.1.csv.gz', '5': '20180519.payers.12.1015500.1.csv.gz'}
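
Note that the keys above are strings, because XML attribute values are always read as text, while the desired output in the question uses integer keys. A possible variation, sketched under the assumption that every `id` is a valid integer and that only the first attachment of each alarm matters, is shown below. `build_alarm_map` and `XML_SAMPLE` are hypothetical names introduced for this example only.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the question's XML, inlined for illustration (hypothetical name).
XML_SAMPLE = """<alarms formatVersion="1">
  <alarm id="4">
    <attachments>
      <attachment filename="20180619.partials.55.1234567.1.csv.gz" mimeType="text/csv"/>
    </attachments>
  </alarm>
  <alarm id="5">
    <attachments>
      <attachment filename="20180519.payers.12.1015500.1.csv.gz" mimeType="text/csv"/>
    </attachments>
  </alarm>
  <alarm id="5"/>
</alarms>"""


def build_alarm_map(root):
    """Map integer alarm id to the filename of that alarm's first attachment."""
    result = {}
    for alarm in root.findall("./alarm"):
        att = alarm.find("attachments/attachment")
        if att is None:
            # Alarm without attachments: skip it so it does not create an entry.
            continue
        # Assumption: the id attribute always holds an integer.
        result[int(alarm.get("id"))] = att.get("filename")
    return result


alarm_map = build_alarm_map(ET.fromstring(XML_SAMPLE))
print(alarm_map)
```

If two alarms with the same `id` both carry attachments, the later one overwrites the earlier one, as in the loop above. Storing a list of filenames per key would be one way to keep them all.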
