How to update an XML file using Python BeautifulSoup

Problem description

I have an XML file in which I need to update the value of a tag. Here is the file content:

<annotation>
    <folder>train</folder>
    <filename>Arms1.jpg</filename>
    <path>D:\PyCharmWorkSpace\Tensorflow\workspace\training_demo\images\train\Arms1.jpg</path>
    <source>
        <database>Unknown</database>
    </source>
</annotation>

In the content above, I need to replace the value of the path tag with a new value. Here is my code:

from bs4 import BeautifulSoup
import os

curr_dir = os.path.join(os.path.dirname(__file__), 'img')
files = os.listdir(curr_dir)

for file in files:
    if ".xml" in file:
        file_path = os.path.join(curr_dir, file)
        with open(file_path, 'r') as f:
            xml_data = f.read()
        bs_data = BeautifulSoup(xml_data, "xml")
        bs_data.path.string = "C:\"
        xml_file = open(file_path, "w")
        xml_file.write(bs_data.prettify())

But the XML file is not updated. Can anyone help? Thanks.

Tags: python, xml, beautifulsoup

Solution


Use ElementTree (no external library required)

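import xml.etree.ElementTree as ET

# Raw string, so the backslashes in the Windows path are kept literally
# instead of being read as escape sequences such as \t.
xml = r'''<annotation>
    <folder>train</folder>
    <filename>Arms1.jpg</filename>
    <path>D:\PyCharmWorkSpace\Tensorflow\workspace\training_demo\images\train\Arms1.jpg</path>
    <source>
        <database>Unknown</database>
    </source>
</annotation>'''

root = ET.fromstring(xml)
# find() returns the first direct child named 'path'; assigning .text replaces its value.
root.find('path').text = 'new path value goes here'
# Print the modified tree to stdout.
ET.dump(root)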

Output

<annotation>
    <folder>train</folder>
    <filename>Arms1.jpg</filename>
    <path>new path value goes here</path>
    <source>
        <database>Unknown</database>
    </source>
</annotation>
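
The example above only prints the modified tree, while the question asks for the files on disk to be updated. A minimal sketch of that step, assuming every annotation file has a top-level path element like the sample, could look as follows. The function name update_path_tags and its parameters are hypothetical and chosen for illustration only.

```python
import os
import xml.etree.ElementTree as ET


def update_path_tags(directory, new_value):
    """Set the text of the top-level <path> element in each .xml file
    in `directory` and return the names of the files that were rewritten.

    Hypothetical helper for illustration, not part of any library.
    """
    changed = []
    for name in sorted(os.listdir(directory)):
        # endswith() avoids matching names that merely contain ".xml".
        if not name.endswith(".xml"):
            continue
        file_path = os.path.join(directory, name)
        tree = ET.parse(file_path)
        node = tree.getroot().find("path")
        if node is None:
            continue  # no <path> element: leave the file untouched
        node.text = new_value
        # Overwrite the original file with the modified tree.
        tree.write(file_path, encoding="utf-8")
        changed.append(name)
    return changed
```

It could be called as update_path_tags(curr_dir, "C:\\"), reusing curr_dir from the question. Note that the literal "C:\" in the question's code is not valid Python, because the backslash escapes the closing quote; the backslash has to be doubled, as in "C:\\".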
