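python - Parsing an XML document with Python
Problem description
I have an XML document that is fairly complex, at least for me, and it holds several pieces of information. I looked into the lxml library to get this done, but I got stuck.
The XML document I have is very similar to the one below: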
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="MeasDataCollection.xsl"?>
<measCollecFile
xmlns="http://www.3gpp.org/ftp/specs/archive/32_series/32.435#measCollec">
<fileHeader fileFormatVersion="32.435 V8.0.0"
vendorName="Nokia">
<fileSender
localDn="MCC=096,MNC=724,ManagedElement=SAEGW01LEM"
elementType="pgw instance 1" />
<measCollec beginTime="2019-05-14T12:00:01-03:00" />
</fileHeader>
<measData>
<managedElement
localDn="MCC=096,MNC=724,ManagedElement=SAEGW01LEM"
swVersion="C-10.0.R9" />
<measInfo measInfoId="KPISystemCP-ISA">
<granPeriod duration="PT300S" endTime="2019-05-14T12:05:01-03:00" />
<measType p="1">VS.avgCpuUtilization</measType>
<measType p="2">VS.avgMemoryUtilization</measType>
<measType p="3">VS.avgMemoryUtilization1M</measType>
<measType p="4">VS.SDFsFpUtilization</measType>
<measType p="5">VS.SDFsLcpUtilization</measType>
<measType p="6">VS.avgVmFpCpuNicUsage</measType>
<measType p="7">VS.avgVmFpCpuWorkerUsage</measType>
<measType p="8">VS.avgVmFpCpuSchedulerUsage</measType>
<measType p="9">VS.avgVmFpCpuCollapsedUsage</measType>
<measType p="10">VS.avgVmFpCpuCombinedUsage</measType>
<measType p="11">VS.hwCfgBitsInfo</measType>
<measValue measObjLdn="KPI=System,GroupName=CP-ISA,group=1,slot=3,mda=1">
<r p="1">1</r>
<r p="2">72</r>
<r p="3">72</r>
<r p="4">0.00</r>
<r p="5">0.00</r>
<r p="6">0.00</r>
<r p="7">0.05</r>
<r p="8">0.00</r>
<r p="9">0.00</r>
<r p="10">0.00</r>
<r p="11">4</r>
<suspect>false</suspect>
</measValue>
</measInfo>
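I would like to know how to access the value of VS.avgMemoryUtilization1M using Python.
I know the value of VS.avgMemoryUtilization1M is 72, but how do I read it from Python with the lxml library?
Solution
You can use BeautifulSoup to parse the XML data (the advantages are that you can use CSS selectors, that it tolerates malformed XML, and so on):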
from bs4 import BeautifulSoup
data = ''' <?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="MeasDataCollection.xsl"?>
<measCollecFile
xmlns="http://www.3gpp.org/ftp/specs/archive/32_series/32.435#measCollec">
<fileHeader fileFormatVersion="32.435 V8.0.0"
vendorName="Nokia">
<fileSender
localDn="MCC=096,MNC=724,ManagedElement=SAEGW01LEM"
elementType="pgw instance 1" />
<measCollec beginTime="2019-05-14T12:00:01-03:00" />
</fileHeader>
<measData>
<managedElement
localDn="MCC=096,MNC=724,ManagedElement=SAEGW01LEM"
swVersion="C-10.0.R9" />
<measInfo measInfoId="KPISystemCP-ISA">
<granPeriod duration="PT300S" endTime="2019-05-14T12:05:01-03:00" />
<measType p="1">VS.avgCpuUtilization</measType>
<measType p="2">VS.avgMemoryUtilization</measType>
<measType p="3">VS.avgMemoryUtilization1M</measType>
<measType p="4">VS.SDFsFpUtilization</measType>
<measType p="5">VS.SDFsLcpUtilization</measType>
<measType p="6">VS.avgVmFpCpuNicUsage</measType>
<measType p="7">VS.avgVmFpCpuWorkerUsage</measType>
<measType p="8">VS.avgVmFpCpuSchedulerUsage</measType>
<measType p="9">VS.avgVmFpCpuCollapsedUsage</measType>
<measType p="10">VS.avgVmFpCpuCombinedUsage</measType>
<measType p="11">VS.hwCfgBitsInfo</measType>
<measValue measObjLdn="KPI=System,GroupName=CP-ISA,group=1,slot=3,mda=1">
<r p="1">1</r>
<r p="2">72</r>
<r p="3">72</r>
<r p="4">0.00</r>
<r p="5">0.00</r>
<r p="6">0.00</r>
<r p="7">0.05</r>
<r p="8">0.00</r>
<r p="9">0.00</r>
<r p="10">0.00</r>
<r p="11">4</r>
<suspect>false</suspect>
</measValue>
</measInfo>'''
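# The 'xml' parser requires lxml to be installed.
soup = BeautifulSoup(data, 'xml')
# Find the measType whose text is the counter name and read its index p.
# ':-soup-contains' replaces the deprecated ':contains' (soupsieve 2.1+).
p = soup.select_one('measType[p]:-soup-contains("VS.avgMemoryUtilization1M")')['p']
# The <r> element with the same p holds the counter's value.
print('Value of `VS.avgMemoryUtilization1M`={}'.format(soup.select_one('r[p="{}"]'.format(p)).text))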
Prints:
Value of `VS.avgMemoryUtilization1M`=72
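Since the question asks about lxml specifically, the same lookup can also be sketched without BeautifulSoup. The example below uses the standard library's xml.etree.ElementTree, whose find/iterfind API lxml.etree largely mirrors, so swapping the import for `from lxml import etree as ET` should behave the same way. The `mc` prefix, the `read_counters` helper, and the trimmed `SAMPLE` document (with closing tags added so it is well-formed) are assumptions made for illustration; they are not part of the original answer.

```python
import xml.etree.ElementTree as ET

# Namespace URI copied from the xmlns attribute of <measCollecFile>.
# The "mc" prefix is an arbitrary label chosen for this sketch.
NS = {"mc": "http://www.3gpp.org/ftp/specs/archive/32_series/32.435#measCollec"}

# Trimmed-down sample; the closing tags are added so it parses strictly.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<measCollecFile xmlns="http://www.3gpp.org/ftp/specs/archive/32_series/32.435#measCollec">
<measData>
<measInfo measInfoId="KPISystemCP-ISA">
<measType p="2">VS.avgMemoryUtilization</measType>
<measType p="3">VS.avgMemoryUtilization1M</measType>
<measValue measObjLdn="KPI=System,GroupName=CP-ISA,group=1,slot=3,mda=1">
<r p="2">72</r>
<r p="3">72</r>
<suspect>false</suspect>
</measValue>
</measInfo>
</measData>
</measCollecFile>"""


def read_counters(xml_bytes):
    """Return {measObjLdn: {counter_name: value_text}} for every measValue."""
    root = ET.fromstring(xml_bytes)
    result = {}
    for info in root.iterfind(".//mc:measInfo", NS):
        # Map each position index p to its counter name.
        names = {t.get("p"): t.text for t in info.iterfind("mc:measType", NS)}
        for value in info.iterfind("mc:measValue", NS):
            # Pair every <r p="..."> with the measType that has the same p.
            result[value.get("measObjLdn")] = {
                names[r.get("p")]: r.text
                for r in value.iterfind("mc:r", NS)
                if r.get("p") in names
            }
    return result


# Bytes are passed in because the document carries an encoding declaration.
counters = read_counters(SAMPLE.encode("utf-8"))
ldn = "KPI=System,GroupName=CP-ISA,group=1,slot=3,mda=1"
print(counters[ldn]["VS.avgMemoryUtilization1M"])  # prints 72
```

The key point is the default namespace: every element in the file lives in the 3GPP namespace, so an unprefixed path such as `.//measInfo` finds nothing with ElementTree or lxml. Note that this sketch keys the result by measObjLdn only, so if several measInfo blocks report the same object, later ones overwrite earlier ones.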