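python - Python: parsing XML fields
Problem description
With the Python 3 script below, I can parse XML records and turn them into a list by extracting the value attribute of each field.
Please help me improve it so that it prints name:value pairs from the XML records instead.
For example, given the fragment below: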
<field name="RecordType" value="RESGJG"/>
<field name="RecordTypeHEC" value="PY"/>
I currently get this output:
RESGJG, PY
Desired output:
RecordType:RESGJG, RecordTypeHEC:PY
My input file: dummy.xml (note that it contains two records; each record starts with record source="AJS/SHD")
<?xml version="1.0" encoding="UTF-8"?>
<records>
<record source="AJS/SHD" type="call">
<group name="General">
<field name="RecordType" value="RESGJG"/>
<field name="RecordTypeHEC" value="PY"/>
<field name="NodeID" value="rock.dsjjgds.cm"/>
<field name="SequenceNumber" value="7937973"/>
<field name="StartDate" value="20171049979"/>
<field name="EndDate" value="201704059739793"/>
<field name="CallDuration" value="973979i"/>
<field name="CauseForRecordClosing" value="normal"/>
</group>
<group name="SIP">
<field name="ICID" value="dshhkdhs"/>
<field name="CallID" value="sdidydakyd2133@10.10.10.1"/>
<field name="User-Agent" value="NotPresent"/>
<field name="Request-URI" value="sip:+47668384"/>
<field name="CalledPartyNumber" value="sip:+08779379972"/>
<field name="CallingPartyNumber" value="sip:+07073873772@10.0.0.1"/>
<field name="To" value="sip:+878379739"/>
<field name="From" value="sip:+937973962"/>
</group>
<group name="VPN">
<field name="VPN_NAME_B" value="blshahd"/>
<field name="VPN_Group_B" value="ctr"/>
<field name="B_ExtType" value="part"/>
<field name="B_ISDN" value="7973"/>
<field name="B_SIP" value="67367672"/>
<field name="B_PABXID" value="797397"/>
</group>
</record>
<record source="AJS/SHD" type="call">
<group name="General">
<field name="RecordType" value="MESGJG"/>
<field name="RecordTypeHEC" value="DY"/>
<field name="NodeID" value="rock.dsjjgds.cm"/>
<field name="SequenceNumber" value="7937973"/>
<field name="StartDate" value="20171049979"/>
<field name="EndDate" value="201704059739793"/>
<field name="CallDuration" value="973979i"/>
<field name="CauseForRecordClosing" value="normal"/>
</group>
<group name="SIP">
<field name="ICID" value="dshhkdhs"/>
<field name="CallID" value="sdidydakyd2133@10.10.10.1"/>
<field name="User-Agent" value="NotPresent"/>
<field name="Request-URI" value="sip:+47668384"/>
<field name="CalledPartyNumber" value="sip:+08779379972"/>
<field name="CallingPartyNumber" value="sip:+07073873772@10.0.0.1"/>
<field name="To" value="sip:+878379739"/>
<field name="From" value="sip:+937973962"/>
</group>
<group name="VPN">
<field name="VPN_NAME_B" value="blshahd"/>
<field name="VPN_Group_B" value="ctr"/>
<field name="B_ExtType" value="part"/>
<field name="B_ISDN" value="7973"/>
<field name="B_SIP" value="67367672"/>
<field name="B_PABXID" value="797397"/>
</group>
</record>
</records>
I have tried the script below, which parses the XML fields and prints them as a list.
import sys
import operator
from functools import reduce
from xml.etree.ElementTree import ElementTree
tree = ElementTree()
tree.parse("dummy.xml")
root = tree.getroot()
data = []
groups = root.findall('.//group')
for group in groups:
    data.append([f.attrib['value'] for f in group.findall('./field')])
q = reduce(operator.concat, data)
s = ", ".join(q)
print(s)
The output I get is:
RESGJG, PY, rock.dsjjgds.cm, 7937973, 20171049979, 201704059739793, 973979i, normal, dshhkdhs, sdidydakyd2133@10.10.10.1, NotPresent, sip:+47668384, sip:+08779379972, sip:+07073873772@10.0.0.1, sip:+878379739, sip:+937973962, blshahd, ctr, part, 7973, 67367672, 797397, MESGJG, DY, rock.dsjjgds.cm, 7937973, 20171049979, 201704059739793, 973979i, normal, dshhkdhs, sdidydakyd2133@10.10.10.1, NotPresent, sip:+47668384, sip:+08779379972, sip:+07073873772@10.0.0.1, sip:+878379739, sip:+937973962, blshahd, ctr, part, 7973, 67367672, 797397
Desired output:
RecordType:RESGJG, RecordTypeHEC:PY, NodeID:rock.dsjjgds.cm, SequenceNumber:7937973, StartDate:20171049979, EndDate:201704059739793, CallDuration:973979i, CauseForRecordClosing:normal, ICID:dshhkdhs, CallID:sdidydakyd2133@10.10.10.1, User-Agent:NotPresent, Request-URI:sip:+47668384, CalledPartyNumber:sip:+08779379972, CallingPartyNumber:sip:+07073873772@10.0.0.1, To:sip:+878379739, From:sip:+937973962, VPN_NAME_B:blshahd, VPN_Group_B:ctr, B_ExtType:part, B_ISDN:7973, B_SIP:67367672, B_PABXID:797397,
RecordType:MESGJG, RecordTypeHEC:DY, NodeID:rock.dsjjgds.cm, SequenceNumber:7937973, StartDate:20171049979, EndDate:201704059739793, CallDuration:973979i, CauseForRecordClosing:normal, ICID:dshhkdhs, CallID:sdidydakyd2133@10.10.10.1, User-Agent:NotPresent, Request-URI:sip:+47668384, CalledPartyNumber:sip:+08779379972, CallingPartyNumber:sip:+07073873772@10.0.0.1, To:sip:+878379739, From:sip:+937973962, VPN_NAME_B:blshahd, VPN_Group_B:ctr, B_ExtType:part, B_ISDN:7973, B_SIP:67367672, B_PABXID:797397,
Please help.
Solution
Your code only reads the value attribute of each field; it ignores name completely.
Also, using reduce is overkill here.
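To illustrate the point about reduce: the flattening step in the question can be written as a single nested comprehension. This is only a minimal sketch. The SAMPLE string is a shortened stand-in for dummy.xml (an assumption made so the snippet runs on its own), and it still prints values only, as the original script did.

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for dummy.xml (assumption: same record/group/field layout).
SAMPLE = """<records><record source="AJS/SHD" type="call">
<group name="General">
<field name="RecordType" value="RESGJG"/>
<field name="RecordTypeHEC" value="PY"/>
</group>
<group name="SIP">
<field name="ICID" value="dshhkdhs"/>
</group>
</record></records>"""

root = ET.fromstring(SAMPLE)

# One nested comprehension walks every group and every field in it,
# so no reduce(operator.concat, ...) is needed to flatten the lists.
values = [field.attrib['value']
          for group in root.findall('.//group')
          for field in group.findall('./field')]

print(', '.join(values))
```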
groups = root.findall('.//group')
for group in groups:
    # Build a "name: value" pair for every field in this group
    print(', '.join('{}: {}'.format(field.attrib['name'], field.attrib['value']) for field in group.findall('./field')))
    print()
This will output (one line per group):
RecordType: RESGJG, RecordTypeHEC: PY, NodeID: rock.dsjjgds.cm, SequenceNumber: 7937973, StartDate: 20171049979, EndDate: 201704059739793, CallDuration: 973979i, CauseForRecordClosing: normal
ICID: dshhkdhs, CallID: sdidydakyd2133@10.10.10.1, User-Agent: NotPresent, Request-URI: sip:+47668384, CalledPartyNumber: sip:+08779379972, CallingPartyNumber: sip:+07073873772@10.0.0.1, To: sip:+878379739, From: sip:+937973962
VPN_NAME_B: blshahd, VPN_Group_B: ctr, B_ExtType: part, B_ISDN: 7973, B_SIP: 67367672, B_PABXID: 797397
RecordType: MESGJG, RecordTypeHEC: DY, NodeID: rock.dsjjgds.cm, SequenceNumber: 7937973, StartDate: 20171049979, EndDate: 201704059739793, CallDuration: 973979i, CauseForRecordClosing: normal
ICID: dshhkdhs, CallID: sdidydakyd2133@10.10.10.1, User-Agent: NotPresent, Request-URI: sip:+47668384, CalledPartyNumber: sip:+08779379972, CallingPartyNumber: sip:+07073873772@10.0.0.1, To: sip:+878379739, From: sip:+937973962
VPN_NAME_B: blshahd, VPN_Group_B: ctr, B_ExtType: part, B_ISDN: 7973, B_SIP: 67367672, B_PABXID: 797397
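The output above is grouped per group element, while the question asks for one line per record with no space after the colon. A possible variation, offered as a sketch rather than as part of the original answer, is to iterate over record elements and collect every field beneath each one. The helper name record_lines and the shortened SAMPLE data are hypothetical and exist only for this illustration.

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for dummy.xml: two records, two groups each (assumption).
SAMPLE = """<records>
  <record source="AJS/SHD" type="call">
    <group name="General">
      <field name="RecordType" value="RESGJG"/>
      <field name="RecordTypeHEC" value="PY"/>
    </group>
    <group name="SIP">
      <field name="ICID" value="dshhkdhs"/>
    </group>
  </record>
  <record source="AJS/SHD" type="call">
    <group name="General">
      <field name="RecordType" value="MESGJG"/>
      <field name="RecordTypeHEC" value="DY"/>
    </group>
    <group name="SIP">
      <field name="ICID" value="dshhkdhs"/>
    </group>
  </record>
</records>"""


def record_lines(root):
    """Return one 'name:value, name:value, ...' string per <record> (hypothetical helper)."""
    lines = []
    for record in root.iter('record'):
        # record.iter('field') visits fields of all groups in document order.
        pairs = ['{}:{}'.format(f.get('name'), f.get('value'))
                 for f in record.iter('field')]
        lines.append(', '.join(pairs))
    return lines


root = ET.fromstring(SAMPLE)
# For the real file, use instead: root = ET.parse('dummy.xml').getroot()
for line in record_lines(root):
    print(line)
```

With the full dummy.xml, this should print two lines, one per record, each containing all fields of the General, SIP and VPN groups.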