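python - How to remove all XML tags matching a tagName and attribute in Python
Problem description
I have a large flattened XSD file in which every tag carries a prefix of the form "xs:Something". I have compiled a list of types that are unused in my flattened XML, and I want an automated way to remove the opening tag, the closing tag, and everything in between.
Example XSD: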
<!--W3C XML Schema generated by XMLSpy v2019 rel. 3 sp1 (x64) (http://www.altova.com)-->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns="http://www.opentravel.org/OTA/2003/05" targetNamespace="http://www.opentravel.org/OTA/2003/05" elementFormDefault="qualified">
<xs:annotation>
<xs:documentation xml:lang="en">All Schema files in the OpenTravel Alliance specification are made available according to the terms defined by the OpenTravel License Agreement at http://www.opentravel.org/Specifications/Default.aspx.</xs:documentation>
</xs:annotation>
<xs:simpleType name="AvailabilityStatusType">
<xs:annotation>
<xs:documentation xml:lang="en">Identifies the availability status of an item.</xs:documentation>
</xs:annotation>
<xs:restriction base="xs:NMTOKENS">
<xs:enumeration value="Open">
<xs:annotation>
<xs:documentation xml:lang="en">Inventory is available for sale.</xs:documentation>
</xs:annotation>
</xs:enumeration>
<xs:enumeration value="Close">
<xs:annotation>
<xs:documentation xml:lang="en">Inventory is not available for sale.</xs:documentation>
</xs:annotation>
</xs:enumeration>
<xs:enumeration value="ClosedOnArrival">
<xs:annotation>
<xs:documentation xml:lang="en">Inventory is not available for sale to arriving guests.</xs:documentation>
</xs:annotation>
</xs:enumeration>
<xs:enumeration value="ClosedOnArrivalOnRequest">
<xs:annotation>
<xs:documentation xml:lang="en">Inventory may not be available for sale to arriving guests.</xs:documentation>
</xs:annotation>
</xs:enumeration>
<xs:enumeration value="OnRequest">
<xs:annotation>
<xs:documentation xml:lang="en">Inventory may be available.</xs:documentation>
</xs:annotation>
</xs:enumeration>
<xs:enumeration value="RemoveCloseOnly">
<xs:annotation>
<xs:documentation xml:lang="en">Remove Close restriction while keeping other restrictions in place.</xs:documentation>
</xs:annotation>
</xs:enumeration>
</xs:restriction>
</xs:simpleType>
<xs:simpleType name="RatePlanEnum">
<xs:annotation>
<xs:documentation xml:lang="en">Identifies rate plan types.</xs:documentation>
</xs:annotation>
<xs:restriction base="xs:NMTOKENS">
<xs:enumeration value="Government">
<xs:annotation>
<xs:documentation xml:lang="en">Inventory is available for sale.</xs:documentation>
</xs:annotation>
</xs:enumeration>
<xs:enumeration value="Negotiated">
<xs:annotation>
<xs:documentation xml:lang="en">Inventory is not available for sale.</xs:documentation>
</xs:annotation>
</xs:enumeration>
<xs:enumeration value="Preferred">
<xs:annotation>
<xs:documentation xml:lang="en">Inventory is not available for sale to arriving guests.</xs:documentation>
</xs:annotation>
</xs:enumeration>
<xs:enumeration value="Other_">
<xs:annotation>
<xs:documentation xml:lang="en">Inventory may not be available for sale to arriving guests.</xs:documentation>
</xs:annotation>
</xs:enumeration>
</xs:restriction>
</xs:simpleType>
</xs:schema>
Suppose my list of unused types is: myTypes = ['RatePlanEnum']
That means I want to remove the entire simpleType node with name="RatePlanEnum".
I tried:
from lxml import etree
doc = etree.parse('myfile.xml')
for elem in doc.findall('.//xs:simpleType'):
    parent = elem.getparent()
    if(elem.attrib.get('name') = 'RatePlanEnum'):
        parent.remove(elem)
How can I do this programmatically and write out the XML after all the modifications?
Solution
This file uses the namespace xmlns:xs="http://www.w3.org/2001/XMLSchema", so in findall() you must use {http://www.w3.org/2001/XMLSchema} instead of xs:
doc.findall('.//{http://www.w3.org/2001/XMLSchema}simpleType')
Documentation: lxml - Namespaces
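As an alternative to writing the full namespace URI in braces, findall() also accepts a prefix-to-URI mapping through its namespaces argument. The following is a minimal sketch, assuming a trimmed-down inline schema; the names XSD_SAMPLE and NS are illustrative and do not come from the original question.

```python
from lxml import etree

# Illustrative, trimmed-down schema; not the original file.
XSD_SAMPLE = b'''<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:simpleType name="AvailabilityStatusType"/>
  <xs:simpleType name="RatePlanEnum"/>
</xs:schema>'''

# Prefix-to-URI mapping used only to resolve prefixes in the query string.
NS = {'xs': 'http://www.w3.org/2001/XMLSchema'}

root = etree.fromstring(XSD_SAMPLE)

# Brace notation: the namespace URI in braces replaces the prefix.
clark = root.findall('.//{http://www.w3.org/2001/XMLSchema}simpleType')

# Prefix notation: 'xs:' is resolved through the namespaces mapping.
prefixed = root.findall('.//xs:simpleType', namespaces=NS)

clark_names = [e.get('name') for e in clark]
prefixed_names = [e.get('name') for e in prefixed]
print(clark_names)
print(prefixed_names)
```

Both queries should select the same elements; the mapping form is mostly a readability choice when several queries share one namespace.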
Full example:
from lxml import etree
data = '''<!--W3C XML Schema generated by XMLSpy v2019 rel. 3 sp1 (x64) (http://www.altova.com)-->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns="http://www.opentravel.org/OTA/2003/05" targetNamespace="http://www.opentravel.org/OTA/2003/05" elementFormDefault="qualified">
<xs:annotation>
<xs:documentation xml:lang="en">All Schema files in the OpenTravel Alliance specification are made available according to the terms defined by the OpenTravel License Agreement at http://www.opentravel.org/Specifications/Default.aspx.</xs:documentation>
</xs:annotation>
<xs:simpleType name="AvailabilityStatusType">
<xs:annotation>
<xs:documentation xml:lang="en">Identifies the availability status of an item.</xs:documentation>
</xs:annotation>
<xs:restriction base="xs:NMTOKENS">
<xs:enumeration value="Open">
<xs:annotation>
<xs:documentation xml:lang="en">Inventory is available for sale.</xs:documentation>
</xs:annotation>
</xs:enumeration>
<xs:enumeration value="Close">
<xs:annotation>
<xs:documentation xml:lang="en">Inventory is not available for sale.</xs:documentation>
</xs:annotation>
</xs:enumeration>
<xs:enumeration value="ClosedOnArrival">
<xs:annotation>
<xs:documentation xml:lang="en">Inventory is not available for sale to arriving guests.</xs:documentation>
</xs:annotation>
</xs:enumeration>
<xs:enumeration value="ClosedOnArrivalOnRequest">
<xs:annotation>
<xs:documentation xml:lang="en">Inventory may not be available for sale to arriving guests.</xs:documentation>
</xs:annotation>
</xs:enumeration>
<xs:enumeration value="OnRequest">
<xs:annotation>
<xs:documentation xml:lang="en">Inventory may be available.</xs:documentation>
</xs:annotation>
</xs:enumeration>
<xs:enumeration value="RemoveCloseOnly">
<xs:annotation>
<xs:documentation xml:lang="en">Remove Close restriction while keeping other restrictions in place.</xs:documentation>
</xs:annotation>
</xs:enumeration>
</xs:restriction>
</xs:simpleType>
<xs:simpleType name="RatePlanEnum">
<xs:annotation>
<xs:documentation xml:lang="en">Identifies rate plan types.</xs:documentation>
</xs:annotation>
<xs:restriction base="xs:NMTOKENS">
<xs:enumeration value="Government">
<xs:annotation>
<xs:documentation xml:lang="en">Inventory is available for sale.</xs:documentation>
</xs:annotation>
</xs:enumeration>
<xs:enumeration value="Negotiated">
<xs:annotation>
<xs:documentation xml:lang="en">Inventory is not available for sale.</xs:documentation>
</xs:annotation>
</xs:enumeration>
<xs:enumeration value="Preferred">
<xs:annotation>
<xs:documentation xml:lang="en">Inventory is not available for sale to arriving guests.</xs:documentation>
</xs:annotation>
</xs:enumeration>
<xs:enumeration value="Other_">
<xs:annotation>
<xs:documentation xml:lang="en">Inventory may not be available for sale to arriving guests.</xs:documentation>
</xs:annotation>
</xs:enumeration>
</xs:restriction>
</xs:simpleType>
</xs:schema>'''
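doc = etree.fromstring(data)
# findall() returns a list, so removing elements while looping is safe
for elem in doc.findall('.//{http://www.w3.org/2001/XMLSchema}simpleType'):
    parent = elem.getparent()
    if elem.attrib.get('name') == 'RatePlanEnum':
        # drops the element together with all of its children
        parent.remove(elem)
# serialize the modified tree back to text
print(etree.tostring(doc).decode())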
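To handle a whole list of unused types rather than a single hard-coded name, the same loop can test membership in a set. This is a sketch under assumptions: the helper remove_named_types, the constant XS, and the trimmed-down sample schema are hypothetical names introduced for illustration, and only xs:simpleType elements are considered.

```python
from lxml import etree

XS = 'http://www.w3.org/2001/XMLSchema'


def remove_named_types(root, unused_types):
    """Remove every xs:simpleType whose name attribute is in unused_types.

    Returns the list of names that were actually removed.
    """
    unused = set(unused_types)
    removed = []
    # findall() returns a list, so removing elements inside the loop is safe.
    for elem in root.findall('.//{%s}simpleType' % XS):
        name = elem.get('name')
        if name in unused:
            elem.getparent().remove(elem)
            removed.append(name)
    return removed


# Illustrative, trimmed-down schema standing in for the real file.
sample = b'''<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:simpleType name="AvailabilityStatusType"/>
  <xs:simpleType name="RatePlanEnum"/>
</xs:schema>'''

root = etree.fromstring(sample)
removed = remove_named_types(root, ['RatePlanEnum'])
result = etree.tostring(root, encoding='unicode')
print(removed)
print(result)
```

When working from a file, etree.parse('myfile.xml') returns an ElementTree; its getroot() can be passed to the helper, and the tree's write() method (for example with encoding='UTF-8' and xml_declaration=True) saves the modified document to an output path of your choice.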