首页 > 解决方案 > Python如何根据CSV排序订购Xml文件?

问题描述

我有这个 XML 文件:- final.xml

<?xml version="1.0"?>
<TestSuite Name="DM123">
  <Group Name="TestRoot" ExecutionPolicy="AnyDeviceAnyOrder">
    <Parameters>
      <Parameter Type="Integer" Name="maxA" Value="1" />
      <Parameter Type="Integer" Name="MaxB" Value="120" />
      <Parameter Type="String" Name="MaxC" Value="integration" />
    </Parameters>
    <Children>
      <Test Name="Test1" Namespace="TestCases">
        <Parameters>
           <Parameter Type="Device" Name="Device">
             <Requirements>
               <Requirement TypeId="a76" Source="User" />
               <Requirement TypeId="2c9" Source="User" />
             </Requirements>
           </Parameter>
        </Parameters>
      </Test>
      <Test Name="Test5" Namespace="TestCases">
        <Parameters>
           <Parameter Type="Dev" Name="Dev">
               <Requirements>
                 <Requirement TypeId="a76" Source="User" />
                 <Requirement TypeId="2c9" Source="User" />
               </Requirements>
           </Parameter>
        </Parameters>
      </Test>
      <Test Name="Test6" Namespace="TestCases">
            <Parameters>
              <Parameter Type="Dev" Name="Dev">
                <Requirements>
                  <Requirement TypeId="a76" Source="User" />
                  <Requirement TypeId="2c9" Source="User" />
                </Requirements>
              </Parameter>
              <Parameter Type="Integer" Name="expected amount of images" Value="10" />
            </Parameters>
      </Test>
   </Children>
  </Group>
  <Models>
    <Model Name="DD1" />
  </Models>
</TestSuite>

xml 文件包含测试用例名称及其详细信息。这只是一个示例,但我在 xml 文件中有数千个这样的测试用例。

我有一个 CSV 文件,它也包含相同的测试用例,但以有序的方式。订单是使用一些特定参数定义的,例如首先处理最短处理时间:-

Id TestName
0    Test5
1    Test1
3    Test6

我想根据 CSV 文件订购 xml 文件。我是python的初学者。我怎么能在python中做到这一点?提前致谢

期望的输出:-

<?xml version="1.0"?>
<TestSuite Name="DM123">
  <Group Name="TestRoot" ExecutionPolicy="AnyDeviceAnyOrder">
    <Parameters>
      <Parameter Type="Integer" Name="maxA" Value="1" />
      <Parameter Type="Integer" Name="MaxB" Value="120" />
      <Parameter Type="String" Name="MaxC" Value="integration" />
    </Parameters>
    <Children>
      <Test Name="Test5" Namespace="TestCases">
        <Parameters>
           <Parameter Type="Dev" Name="Dev">
               <Requirements>
                 <Requirement TypeId="a76" Source="User" />
                 <Requirement TypeId="2c9" Source="User" />
               </Requirements>
           </Parameter>
        </Parameters>
      </Test>
      <Test Name="Test1" Namespace="TestCases">
        <Parameters>
           <Parameter Type="Device" Name="Device">
             <Requirements>
               <Requirement TypeId="a76" Source="User" />
               <Requirement TypeId="2c9" Source="User" />
             </Requirements>
           </Parameter>
        </Parameters>
      </Test>
      <Test Name="Test6" Namespace="TestCases">
            <Parameters>
              <Parameter Type="Dev" Name="Dev">
                <Requirements>
                  <Requirement TypeId="a76" Source="User" />
                  <Requirement TypeId="2c9" Source="User" />
                </Requirements>
              </Parameter>
              <Parameter Type="Integer" Name="expected amount of images" Value="10" />
            </Parameters>
      </Test>
   </Children>
  </Group>
  <Models>
    <Model Name="DD1" />
  </Models>
</TestSuite>

标签: pythonpandasxmlcsv

解决方案


一种可能的方法是使用 BeautifulSoup 来帮助进行 XML 解析。这可以使用以下方式安装:

pip install beautifulsoup4

编码:

from bs4 import BeautifulSoup
import csv
import glob


tests = {}

# Read in the XML files and parse them

for xml_filename in glob.glob('*.xml'):
    print(f'Parsing {xml_filename}')
    
    with open(xml_filename) as f_input:
        soup = BeautifulSoup(f_input, 'xml')
        
    # Store all test tags in a dictionary 
    # (this assumes each test has a unique name)

    for test in soup.find_all('Test'):
        tests[test['Name']] = test

# Locate the children tag and empty it

children = soup.find('Children')
children.clear()

# Read in the ordering CSV and append the tests in the required order into
# the children tag

with open('order.csv', newline='') as f_order:
    csv_order = csv.reader(f_order)
    header = next(csv_order)
    
    for id, test_name in csv_order:
        try:
            children.append(tests.pop(test_name))
        except KeyError:
            print(f"Test '{test_name}' not present")
 
# Add any remaining tests not listed in the CSV
    
for test in tests.values():
    children.append(test)
    
# Write the modified xml

with open('output.xml', 'w') as f_output:
    f_output.write(str(soup))

XML 的缩进将被修改,但结构将保持不变。通过将输出写入更改为:

f_output.write(soup.prettify())

它首先将所有 XML 文件中的所有测试存储在 Python 字典中。然后在读取您的订购 CSV 文件时,它使用 pop 从字典中读取每个测试并将其附加到子标签中。通过使用pop()它还可以从字典中删除测试。然后可以将字典中剩余的任何测试(不在排序 CSV 中)添加到末尾(如果需要)。


推荐阅读