Preserving XML structure after adding elements with lxml in Python

Problem description

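First of all, I'm new to Stack Overflow, so if there is room for improvement, please be kind and offer constructive criticism on how to improve this question.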

Problem: I want to preserve the structure of the XML file created by the code below. I want it to look like this:

<?xml version="1.0" encoding="UTF-8" ?>
<save>
    <header version="2" />
    <version major="3" minor="6" revision="2" build="0" />
    <region id="ModuleSettings">
        <node id="root">
            <children>
                <node id="ModOrder">
                    <children>
                        <node id="Module">
                            <attribute id="UUID" value="627f624f-e2b8-4b37-977e-03044e500fec" type="22" />
                        </node>
                        <node id="Module">
                            <attribute id="UUID" value="7e737d2f-31d2-4751-963f-be6ccc59cd0c" type="22" />
                        </node>
                    </children>
                </node>
                <node id="Mods">
                    <children>
                        <node id="ModuleShortDesc">
                            <attribute id="Folder" value="somestuff1" type="30" />
                            <attribute id="MD5" value="somestuff1" type="23" />
                            <attribute id="Name" value="somestuff1" type="22" />
                            <attribute id="UUID" value="627f624f-e2b8-4b37-977e-03044e500fec" type="22" />
                            <attribute id="Version" value="1" type="4" />
                        </node>
                        <node id="ModuleShortDesc">
                            <attribute id="Folder" value="somestuff2" type="30" />
                            <attribute id="MD5" value="" type="23" />
                            <attribute id="Name" value="somestuff2" type="22" />
                            <attribute id="UUID" value="7e737d2f-31d2-4751-963f-be6ccc59cd0c" type="22" />
                            <attribute id="Version" value="2" type="4" />
                        </node>
                    </children>
                </node>
            </children>
        </node>
    </region>
</save>

Instead, I get this:

<?xml version="1.0" encoding="UTF-8" ?>
<save>
    <header version="2" />
    <version major="3" minor="6" revision="2" build="0" />
    <region id="ModuleSettings">
        <node id="root">
            <children>
                <node id="ModOrder">
                    <children>
                    <node id="Module"><attribute id="UUID" value="627f624f-e2b8-4b37-977e-03044e500fec" type="22" /></node><node id="Module"><attribute id="UUID" value="7e737d2f-31d2-4751-963f-be6ccc59cd0c" type="22" /></node></children>
                </node>
                <node id="Mods">
                    <children>
                    <node id="ModuleShortDesc"><attribute id="Folder" value="somestuff1" type="30" /><attribute id="MD5" value="somestuff1" type="23" /><attribute id="Name" value="somestuff1" type="22" /><attribute id="UUID" value="627f624f-e2b8-4b37-977e-03044e500fec" type="22" /><attribute id="Version" value="1" type="4" /></node><node id="ModuleShortDesc"><attribute id="Folder" value="somestuff2" type="30" /><attribute id="MD5" value="" type="23" /><attribute id="Name" value="somestuff2" type="22" /><attribute id="UUID" value="7e737d2f-31d2-4751-963f-be6ccc59cd0c" type="22" /><attribute id="Version" value="2" type="4" /></node></children></node>
            </children>
        </node>
    </region>
</save>

Focusing only on the ModOrder node, here is my current code:

from lxml import etree as et

# Create a Module element and append it to the given <children> element:
def new_module(uuid, ModOrder):

    ''' Example Module:
        <node id="Module">
            <attribute id="UUID" value="627f624f-e2b8-4b37-977e-03044e500fec" type="22" />
        </node>
    '''

    uuid = str(uuid)

    module = et.SubElement(ModOrder, "node")
    module.set("id", "Module")

    attribute_uuid = et.SubElement(module, "attribute")
    attribute_uuid.set("id", "UUID")
    attribute_uuid.set("value", uuid)
    attribute_uuid.set("type", "22")

    return module

def generator2():

    # mods_dictionary returns 2 lists of dictionaries:
    info = mods_dictionary(a1)

    # info[0] contains a list of dictionaries.
    # Each dictionary contains information of each mod pulled from meta.lsx file inside each pak
    data_list = info[0]
    # error_list = info[1] # Not needed

    # ModOrderTree = element tree object @ <node id="ModOrder">
    ModOrderTree = tree.xpath('//node[@id="ModOrder"]')[0]

    # ModOrder = element tree object @ <children>   
    ModOrder = ModOrderTree.find('children')

    # For each dictionary inside data_list
    for mods in data_list:
        order = new_module(mods["UUID"],ModOrder)
        desc = new_moduleshortdesc(mods["Name"], mods["Author"], mods["Version"], mods["UUID"], mods["Folder"])

    # Then write to file:    
    tree.write('testwrite.xml')

generator2()

Is there any way to achieve what I want?

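Keep in mind that I'm new to programming and still learning a lot, so I'm sure there are more Pythonic ways to write this code more efficiently. If I did anything noobish that bothers you, feel free to offer suggestions :p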

Things I've tried:

    t1 = et.tostring(tree, encoding="unicode", method="xml", pretty_print=True)
    with open(test_file, 'w') as f:
        f.write(t1)

    tree.write('testwrite.xml', pretty_print=True)

Tags: xml, python-3.x, lxml, elementtree

Solution


Solution (thanks to woodm1979's answer to "Python pretty XML printer with lxml"):

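pretty_print only adds indentation where the tree holds no whitespace text of its own, so the whitespace left over from the original file stops the newly added elements from being indented. Simply strip the whitespace-only text from the whole document, then let the serializer re-indent it properly: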

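def reformat(file):
    # Write the unformatted XML first
    generator2()

    # remove_blank_text drops whitespace-only text nodes,
    # which lets pretty_print re-indent the whole tree
    parser = et.XMLParser(remove_blank_text=True)
    tree = et.parse(file, parser)
    tree.write(file, encoding='utf-8', pretty_print=True, xml_declaration=True)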
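The same idea can be tried out in memory, without writing a file and reading it back. The following is only a minimal sketch under assumptions: `SOURCE`, `add_module` and `reindent` are illustrative names that do not appear in the original code, and the document is cut down to a single ModOrder node. It shows the one-line output before the round trip and the indented output after it.

```python
from lxml import etree as et

# A cut-down stand-in for the real file (assumption: the real one is larger).
SOURCE = b"""<save>
    <node id="ModOrder">
        <children>
        </children>
    </node>
</save>"""


def add_module(children, uuid):
    """Append a <node id="Module"> holding one UUID attribute element."""
    module = et.SubElement(children, "node", id="Module")
    et.SubElement(module, "attribute", id="UUID", value=str(uuid), type="22")
    return module


def reindent(root):
    """Re-parse with whitespace-only text removed, then pretty-print."""
    parser = et.XMLParser(remove_blank_text=True)
    clean_root = et.fromstring(et.tostring(root), parser)
    return et.tostring(clean_root, pretty_print=True, encoding="unicode")


root = et.fromstring(SOURCE)
children = root.find(".//children")
add_module(children, "627f624f-e2b8-4b37-977e-03044e500fec")
add_module(children, "7e737d2f-31d2-4751-963f-be6ccc59cd0c")

# Existing whitespace inside <children> keeps the new nodes on one line.
before = et.tostring(root, pretty_print=True, encoding="unicode")

# After the round trip every element sits on its own indented line.
after = reindent(root)
print(after)
```

Note that lxml indents with two spaces per level, so the result will not match the four-space layout of the original file exactly.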
