Get the highest level of element nesting in recursively nested XML

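Problem description

For every element in an arbitrarily, recursively nested XML document, I need to find its maximum nesting level, that is, how deeply the element is nested inside elements with the same tag.

For example, for this XML: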

<chorus>
    <l>Alright now lose it <ah>aah <i>aah <ah>a<ah>a</ah>h</ah> aah</i> aah</ah></l>
    <l>Just lose it aah aah aah aah aah</l>
    <l>Go crazy aah aah aah aah aah</l>
    <l>Oh baby <ah>aah aah</ah>, oh baby baby <ah>aah aah</ah></l>
</chorus>

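The output should look like this: {"chorus": 0, "l": 0, "ah": 2, "i": 0}

Unfortunately, the solution is restricted to xml.etree.ElementTree.

I have tried different approaches for several hours, but I can't figure it out.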

Tags: python, python-3.x, xml, xml-parsing, elementtree

Solution


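You can use a modified version of the XMLParser target example from the xml.etree.ElementTree documentation.

Change maxDepth and depth into dictionaries keyed by element name (tag), so that each tag gets its own counter: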

Python

from xml.etree.ElementTree import XMLParser


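class MaxDepth:  # The target object of the parser
    def __init__(self):
        # Instance attributes, so separate MaxDepth objects do not share state.
        self.maxDepth = {}  # tag -> deepest self-nesting level seen
        self.depth = {}  # tag -> current self-nesting level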

    def start(self, tag, attrib):  # Called for each opening tag.
        try:
            self.depth[tag] += 1
        except KeyError:
            self.depth[tag] = 0
            self.maxDepth[tag] = 0
        if self.depth[tag] > self.maxDepth[tag]:
            self.maxDepth[tag] = self.depth[tag]

    def end(self, tag):  # Called for each closing tag.
        self.depth[tag] -= 1

    def data(self, data):
        pass  # We do not need to do anything with data.

    def close(self):  # Called when all data has been parsed.
        return self.maxDepth


target = MaxDepth()
parser = XMLParser(target=target)
exampleXml = """
<chorus>
    <l>Alright now lose it <ah>aah <i>aah <ah>a<ah>a</ah>h</ah> aah</i> aah</ah></l>
    <l>Just lose it aah aah aah aah aah</l>
    <l>Go crazy aah aah aah aah aah</l>
    <l>Oh baby <ah>aah aah</ah>, oh baby baby <ah>aah aah</ah></l>
</chorus>"""
parser.feed(exampleXml)
print(parser.close())

Output

{'chorus': 0, 'l': 0, 'ah': 2, 'i': 0}

Edited Python (for the case where chorus is already an ElementTree.Element object)

import xml.etree.ElementTree as ET
from xml.etree.ElementTree import XMLParser


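class MaxDepth:  # The target object of the parser
    def __init__(self):
        # Instance attributes, so separate MaxDepth objects do not share state.
        self.maxDepth = {}  # tag -> deepest self-nesting level seen
        self.depth = {}  # tag -> current self-nesting level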

    def start(self, tag, attrib):  # Called for each opening tag.
        try:
            self.depth[tag] += 1
        except KeyError:
            self.depth[tag] = 0
            self.maxDepth[tag] = 0
        if self.depth[tag] > self.maxDepth[tag]:
            self.maxDepth[tag] = self.depth[tag]

    def end(self, tag):  # Called for each closing tag.
        self.depth[tag] -= 1

    def data(self, data):
        pass  # We do not need to do anything with data.

    def close(self):  # Called when all data has been parsed.
        return self.maxDepth


exampleXml = """
<chorus>
    <l>Alright now lose it <ah>aah <i>aah <ah>a<ah>a</ah>h</ah> aah</i> aah</ah></l>
    <l>Just lose it aah aah aah aah aah</l>
    <l>Go crazy aah aah aah aah aah</l>
    <l>Oh baby <ah>aah aah</ah>, oh baby baby <ah>aah aah</ah></l>
</chorus>"""

chorus_element = ET.fromstring(exampleXml)

target = MaxDepth()
parser = XMLParser(target=target)
parser.feed(ET.tostring(chorus_element))
print(parser.close())
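
If the data is already an Element, serializing it again just to feed a parser is a detour. As a rough alternative, the same counting can probably be done by walking the tree directly. The sketch below is not part of the original answer: the function name max_self_nesting and its helper parameters are made up for illustration, and it assumes the tree is shallow enough for plain recursion (very deep documents could hit Python's recursion limit).

```python
import xml.etree.ElementTree as ET


def max_self_nesting(element, depth=None, max_depth=None):
    """Return {tag: deepest level at which that tag is nested inside itself}.

    Hypothetical helper; depth and max_depth are internal accumulators
    that callers normally leave unset.
    """
    if depth is None:
        depth = {}
        max_depth = {}
    tag = element.tag
    # Entering the element: same bookkeeping as the start() callback.
    depth[tag] = depth.get(tag, -1) + 1
    max_depth[tag] = max(max_depth.get(tag, 0), depth[tag])
    for child in element:
        max_self_nesting(child, depth, max_depth)
    # Leaving the element: same bookkeeping as the end() callback.
    depth[tag] -= 1
    return max_depth


sample = ET.fromstring("<a><b><a><a/></a></b><b/></a>")
print(max_self_nesting(sample))  # {'a': 2, 'b': 0}
```

Calling max_self_nesting(chorus_element) on the element from the example above should give the same dictionary as the parser-based version.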
