python - Creating a Nested XML document in Python
Problem description
I'm an occasional scripter. Searching this forum has taken me this far, but I'm now stuck and looking for help. I am trying to create an XML document from a CSV file, and the aim is to take input that looks like this:
ID,Type,Currency,Notional,Underlying,Maturity Date,Representation Type
ID1,COMMIT,EUR,100,,2018-06-01,Bond
ID2,COMMIT,AUD,110,,2018-03-25,Stock
and transform it to look like this:
<tradeRequests>
  <tradeRequest>
    <id>ID1</id>
    <newDeals size="1">
      <deal>
        <id>ID1</id>
        <terms>
          <id>ID1</id>
          <MaturityDate>2018-06-01</MaturityDate>
        </terms>
      </deal>
    </newDeals>
  </tradeRequest>
  <tradeRequest>
    <id>ID2</id>
    <newDeals size="1">
      <deal>
        <id>ID2</id>
        <terms>
          <id>ID2</id>
          <MaturityDate>2018-03-25</MaturityDate>
        </terms>
      </deal>
    </newDeals>
  </tradeRequest>
</tradeRequests>
The problem is that my script doesn't produce this structure: every row should essentially become its own tradeRequest, but that isn't what I get.
Here is a snippet of my code, which extracts a subset of columns from a much larger set of columns.
import csv
import xml.etree.ElementTree as ET
import xml.dom.minidom

tradeRequests = ET.Element("tradeRequests")
tradeRequest = ET.SubElement(tradeRequests, "tradeRequest")
newDeals = ET.SubElement(tradeRequest, "newDeals")
deal = ET.SubElement(newDeals, "deal")
dealid = ET.SubElement(deal, "id")

with open('TestCase.csv') as csvfile:
    reader = csv.DictReader(csvfile)
    for row in reader:
        ET.SubElement(tradeRequest, "id").text = row['ID']
        ET.SubElement(tradeRequest, "newDeals", {'size':"1"} )
        ET.SubElement(dealid, "id").text = row['ID']
        ET.SubElement(dealid, "maturityDate").text = row['Maturity Date']

tree = ET.ElementTree(tradeRequests)
tree.write("Testcase.xml" )
xml = xml.dom.minidom.parse('Testcase.xml')
pretty_xml_as_string = xml.toprettyxml()
print(pretty_xml_as_string)
The problem is I can't seem to nest the items properly. I've tried creating a parent/child combination, but without success. Instead, the code above produces output that looks like this:
<tradeRequests>
  <tradeRequest>
    <newDeals>
      <deal>
        <id>
          <id>ID1</id>
          <maturityDate>2018-06-01</maturityDate>
          <id>ID2</id>
          <maturityDate>2018-03-25</maturityDate>
        </id>
      </deal>
    </newDeals>
    <id>ID1</id>
    <newDeals size="1"/>
    <id>ID2</id>
    <newDeals size="1"/>
  </tradeRequest>
</tradeRequests>
Any help appreciated as always.
I hadn't anticipated this use case, where I need to loop and create elements dynamically:
ID1,COMMIT,EUR,100,,2018-06-01,Bond
ID2,110,2018-03-25,Stock
ID2,110,2018-03-26,A
ID2,110,2018-03-26,B
ID2,110,2018-03-26,C
So in effect I need to loop through the ID2 rows and dynamically create a new element for each one; the number of rows is not known in advance.
So my expected result would be something like:
<tradeRequests>
<ids>
<id>ID1</id>
<element>
<maturityDate>2018-06-01</maturityDate>
<type>Stock</type>
</element>
</id>
<id>ID2</id>
<element>
<maturityDate>2018-03-25</maturityDate>
<type>A</type>
</element>
<element>
<maturityDate>2018-03-25</maturityDate>
<type>B</type>
</element>
<element>
<maturityDate>2018-03-25</maturityDate>
<type>C</type>
</element>
</id>
</tradeRequests>
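Solution
I strongly recommend the excellent lxml library. It is very fast because it is a wrapper around the C library libxml2, and it includes an element builder object, E, which makes this job very easy: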
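import csv
import lxml.etree
from lxml.builder import E

with open('TestCase.csv') as csvfile:
    # Build one <tradeRequest> per CSV row and pass them all as children of the root.
    results = E.tradeRequests(*(
        E.tradeRequest(
            E.id(row['ID']),
            E.newDeals(
                E.deal(
                    E.id(row['ID']),
                    E.terms(
                        E.id(row['ID']),
                        E.MaturityDate(row['Maturity Date']),
                    )
                ),
                size="1",  # keyword arguments become XML attributes
            )
        ) for row in csv.DictReader(csvfile))
    )

# encoding="unicode" returns a str rather than bytes, so it prints cleanly on Python 3
print(lxml.etree.tostring(results, pretty_print=True, encoding="unicode"))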
Result:
<tradeRequests>
  <tradeRequest>
    <id>ID1</id>
    <newDeals size="1">
      <deal>
        <id>ID1</id>
        <terms>
          <id>ID1</id>
          <MaturityDate>2018-06-01</MaturityDate>
        </terms>
      </deal>
    </newDeals>
  </tradeRequest>
  <tradeRequest>
    <id>ID2</id>
    <newDeals size="1">
      <deal>
        <id>ID2</id>
        <terms>
          <id>ID2</id>
          <MaturityDate>2018-03-25</MaturityDate>
        </terms>
      </deal>
    </newDeals>
  </tradeRequest>
</tradeRequests>
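If you would rather stay with the standard library, the same structure can probably be built with xml.etree.ElementTree. The script in the question creates tradeRequest, newDeals and deal once, before the loop, so every row ends up attached to the same parents; creating them inside the loop gives one tradeRequest per row. The sketch below also takes a guess at the follow-up case by grouping rows that share an ID with itertools.groupby. Several things here are my assumptions rather than anything from the question: the function names, the in-memory sample data and its three-column header, and storing the ID in a "value" attribute on <id> (the expected output in the question is not well-formed, so the exact layout had to be guessed).

```python
import csv
import io
import xml.dom.minidom
import xml.etree.ElementTree as ET
from itertools import groupby

# Hypothetical sample data standing in for TestCase.csv (assumed columns).
SAMPLE_CSV = """\
ID,Maturity Date,Representation Type
ID1,2018-06-01,Bond
ID2,2018-03-25,A
ID2,2018-03-26,B
ID2,2018-03-26,C
"""


def build_trade_requests(rows):
    """Build one <tradeRequest> per row; every element is created inside the loop."""
    root = ET.Element("tradeRequests")
    for row in rows:
        request = ET.SubElement(root, "tradeRequest")
        ET.SubElement(request, "id").text = row["ID"]
        new_deals = ET.SubElement(request, "newDeals", {"size": "1"})
        deal = ET.SubElement(new_deals, "deal")
        ET.SubElement(deal, "id").text = row["ID"]
        terms = ET.SubElement(deal, "terms")
        ET.SubElement(terms, "id").text = row["ID"]
        ET.SubElement(terms, "MaturityDate").text = row["Maturity Date"]
    return root


def build_grouped(rows):
    """Build one <id> per distinct ID, with one <element> per row sharing that ID."""
    root = ET.Element("tradeRequests")
    ids = ET.SubElement(root, "ids")
    # groupby only merges adjacent rows, so sort by ID first (sorted is stable).
    ordered = sorted(rows, key=lambda r: r["ID"])
    for trade_id, group in groupby(ordered, key=lambda r: r["ID"]):
        # Assumption: the ID is kept in a "value" attribute to keep the XML well-formed.
        id_el = ET.SubElement(ids, "id", {"value": trade_id})
        for row in group:
            element = ET.SubElement(id_el, "element")
            ET.SubElement(element, "maturityDate").text = row["Maturity Date"]
            ET.SubElement(element, "type").text = row["Representation Type"]
    return root


def pretty(root):
    """Return an indented string for an ElementTree element."""
    return xml.dom.minidom.parseString(ET.tostring(root)).toprettyxml(indent="  ")


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
print(pretty(build_trade_requests(rows)))
print(pretty(build_grouped(rows)))
```

To run this against the real file, replace io.StringIO(SAMPLE_CSV) with the file object from open('TestCase.csv'). The number of <element> children per ID falls out of the grouping, so it does not need to be known in advance.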