Creating new XML attributes from another attribute

Question

I have the following XML:

<icim source="source">
  <object class="class_name" name="class_name">
    <attribute name="Type">
      <string>Type_Name</string>
</attribute>
    <attribute name="DisplayName">
      <string>DisplayName</string>
</attribute>
    <attribute name="Vendor">
      <string>Vendor_Name</string>
</attribute>
    <attribute name="Model">
      <string>Model_Name</string>
</attribute>
    <attribute name="Description">
      <string>Description_part1, Description_part2, Description_part3, Description_part4, Description_part5</string>
</attribute>
</object>
  <object class="class_name" name="class_name">
    <attribute name="Type">
      <string>Type_Name</string>
</attribute>
    <attribute name="DisplayName">
      <DisplayName</string>
</attribute>
    <attribute name="Vendor">
      <string>Vendor_Name</string>
</attribute>
    <attribute name="Model">
      <string>Model_Name</string>
</attribute>
    <attribute name="Description">
      <string>Description_part1, Description_part2, Description_part3, Description_part4, Description_part5</string>
</attribute>
</object>
.
.
.
</icim>

I want to use Python's ElementTree to transform it into:

<icim source="source">
  <object class="class_name" name="class_name">
    <attribute name="Type">
      <string>Type_Name</string>
</attribute>
    <attribute name="DisplayName">
      <string>DisplayName</string>
</attribute>
    <attribute name="Vendor">
      <string>Vendor_Name</string>
</attribute>
    <attribute name="Model">
      <string>Model_Name</string>
</attribute>
    <attribute name="String1">
      <string>Description_part1</string>
</attribute>
    <attribute name="String2">
      <string>Description_part2</string>
</attribute>
    <attribute name="String3">
      <string>Description_part3</string>
</attribute>
    <attribute name="Description">
      <string>Description_part1, Description_part2, Description_part3, Description_part4, Description_part5</string>
</attribute>
</object>
  <object class="class_name" name="class_name">
    <attribute name="Type">
      <string>Type_Name</string>
</attribute>
    <attribute name="DisplayName">
      <DisplayName</string>
</attribute>
    <attribute name="Vendor">
      <string>Vendor_Name</string>
</attribute>
    <attribute name="Model">
      <string>Model_Name</string>
</attribute>
    <attribute name="String1">
      <string>Description_part1</string>
</attribute>
    <attribute name="String2">
      <string>Description_part2</string>
</attribute>
    <attribute name="String3">
      <string>Description_part3</string>
</attribute>
    <attribute name="Description">
      <string>Description_part1, Description_part2, Description_part3, Description_part4, Description_part5</string>
</attribute>
</object>
.
.
.
</icim>

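In other words, I want to take the first three comma-separated parts of each Description element (the description always contains commas, so it can be split on them) and create a new attribute for each of those three parts. Any ideas?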
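
For illustration, the splitting step on its own might look like the sketch below. The helper name first_parts and its limit parameter are hypothetical, and it assumes the parts are separated by plain commas with optional surrounding whitespace.

```python
def first_parts(description, limit=3):
    """Return the first `limit` comma-separated parts, stripped of whitespace."""
    return [part.strip() for part in description.split(",")[:limit]]


parts = first_parts(
    "Description_part1, Description_part2, Description_part3, "
    "Description_part4, Description_part5"
)
print(parts)
```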

Tags: python, xml, parsing, tree, element

Answer


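Your XML and the expected XML are not well-formed (<DisplayName</string> should be <string>DisplayName</string>), but assuming that is fixed, and if I understand you correctly, the following should at least get you started: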

from lxml import etree

display = """[your xml above, corrected]"""
doc = etree.XML(display)

for obj in doc.xpath("//object"):
    # First three comma-separated parts of the Description text
    news = obj.xpath('.//attribute[@name="Description"]/string/text()')[0].split(',')[:3]
    counter = len(news)  # normally 3; fewer if the description has fewer parts
    # Each new element is inserted at index 4 (just before Description in this sample),
    # so walk the list in reverse to end up with String1, String2, String3 in order
    for new in reversed(news):
        ins = etree.fromstring(f'<attribute name="String{counter}">\n      <string>{new.strip()}</string>\n</attribute>\n')
        obj.insert(4, ins)
        counter -= 1  # count down for the same reason
print(etree.tostring(doc).decode())

The output should match your expected output.
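
Since the question asks about ElementTree specifically, here is a minimal sketch of the same idea using the standard library's xml.etree.ElementTree instead of lxml. It is not the answer author's method and rests on some assumptions: SAMPLE is a hypothetical, shortened document with well-formed tags, add_description_parts and limit are illustrative names, and the new elements are placed just before Description by looking up its position rather than relying on a fixed index.

```python
import xml.etree.ElementTree as ET

# Hypothetical, shortened sample: one object, well-formed tags
SAMPLE = """<icim source="source">
  <object class="class_name" name="class_name">
    <attribute name="Type"><string>Type_Name</string></attribute>
    <attribute name="Description"><string>part1, part2, part3, part4, part5</string></attribute>
  </object>
</icim>"""


def add_description_parts(root, limit=3):
    """Insert String1..StringN attributes just before each object's Description."""
    # findall returns a list, so inserting children while looping is safe
    for obj in root.findall(".//object"):
        desc = obj.find('attribute[@name="Description"]')
        if desc is None:
            continue  # this object has no Description to split
        text = desc.findtext("string", default="")
        # Keep non-empty parts only, then take the first `limit` of them
        parts = [p.strip() for p in text.split(",") if p.strip()][:limit]
        position = list(obj).index(desc)
        for offset, part in enumerate(parts):
            new_attr = ET.Element("attribute", {"name": f"String{offset + 1}"})
            ET.SubElement(new_attr, "string").text = part
            new_attr.tail = desc.tail  # reuse Description's trailing whitespace
            obj.insert(position + offset, new_attr)
    return root


root = add_description_parts(ET.fromstring(SAMPLE))
names = [attr.get("name") for attr in root.iter("attribute")]
print(names)
print(ET.tostring(root, encoding="unicode"))
```

Building the elements with ET.Element and ET.SubElement, rather than parsing an f-string, also means that description parts containing characters such as & or < are escaped when the tree is serialized.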

