Unable to reference elements of an XML file

Problem description

I've been experimenting with some government data, trying to use Python to extract elements from the XML files they provide.

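Below is a sample of one of the products. Basically, what I'm trying to do is pull the data out of HospitalCover, i.e. whether each particular service is covered.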

<Product
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xsd="http://www.w3.org/2001/XMLSchema" ProductCode="J7/WAYB20" ProductItemID="024D633D-C67E-4609-B51D-DB7D4DD3D8A5" ProductID="95ddc834-6cce-4c6e-a28a-020cadd22b40" FundItemID="63328fd4-b6b7-441e-a4fe-bb1df2e3eedf" Status="Published" StatusDate="2020-10-01T00:00:55.6" DateModified="2020-08-10T11:35:41.473" DateCreated="2020-08-10T11:35:41.3746927" DateApproved="2020-08-10T11:45:41.283" PublishDate="01-Oct-20 00:00" SchemaVersion="3.0"
xmlns="http://admin.privatehealth.gov.au/ws/Schemas" xsi:schemaLocation="http://admin.privatehealth.gov.au/ws/Schemas PHOLSchema-V3.0.xsd">
<FundCode>CBH</FundCode>
<ProductGroupCode>J7</ProductGroupCode>
<Name>LiveLife (Gold)</Name>
<ProductType>Combined</ProductType>
<ProductURL xsi:nil="true" />
<PHISURL xsi:nil="true" />
<FundsProductCode xsi:nil="true" />
<ProductStatus>Closed</ProductStatus>
<Corporate Atomic="true" />
<OnlyAvailableWith Atomic="true">
    <NotApplicable xsi:nil="true" />
</OnlyAvailableWith>
<DateValidFrom xsi:nil="true" />
<DateValidTo xsi:nil="true" />
<DateIssued>2020-10-01</DateIssued>
<State>WA</State>
<Scale>Couple</Scale>
<Excesses ExcessType="None" Atomic="true" />
<CoPayments CoPaymentType="Limited" Atomic="true">
    <Shared>70</Shared>
    <SharedMax>420</SharedMax>
    <Private>70</Private>
    <PrivateMax>420</PrivateMax>
    <DaySurgery>70</DaySurgery>
    <AnnualMax>840</AnnualMax>
</CoPayments>
<MedicareLevySurchargeExempt>true</MedicareLevySurchargeExempt>
<PremiumNoRebate>602.9</PremiumNoRebate>
<PremiumHospitalComponent>401.93</PremiumHospitalComponent>
<AddOns />
<Brands />
<HospitalCover AccidentCover="true" BasedOnID="">
    <HospitalTier>Gold</HospitalTier>
    <AgeBasedDiscount Available="true" AvailableForTransferee="true" />
    <Accommodation>PrivateOrPublic</Accommodation>
    <HospitalPercent xsi:nil="true" />
    <LimitHospitalDays xsi:nil="true" />
    <MedicalServices>
        <MedicalService Title="AssistedReproductive" Cover="Covered" />
        <MedicalService Title="BackNeckSpine" Cover="Covered" />
        <MedicalService Title="Blood" Cover="Covered" />
        <MedicalService Title="BoneJointMuscle" Cover="Covered" />
        <MedicalService Title="BrainNervousSystem" Cover="Covered" />
        <MedicalService Title="BreastSurgery" Cover="Covered" />
        <MedicalService Title="Cataracts" Cover="Covered" />
        <MedicalService Title="ChemotherapyRadiotherapyImmunotherapy" Cover="Covered" />
        <MedicalService Title="DentalSurgery" Cover="Covered" />
        <MedicalService Title="Diabetes" Cover="Covered" />
        <MedicalService Title="Dialysis" Cover="Covered" />
        <MedicalService Title="DigestiveSystem" Cover="Covered" />
        <MedicalService Title="EarNoseThroat" Cover="Covered" />
        <MedicalService Title="Eye" Cover="Covered" />
        <MedicalService Title="GastrointestinalEndoscopy" Cover="Covered" />
        <MedicalService Title="Gynaecology" Cover="Covered" />
        <MedicalService Title="HeartVascular" Cover="Covered" />
        <MedicalService Title="HerniaAppendix" Cover="Covered" />
        <MedicalService Title="HospitalPsychiatric" Cover="Covered" />
        <MedicalService Title="ImplantationHearingDevices" Cover="Covered" />
        <MedicalService Title="InsulinPumps" Cover="Covered" />
        <MedicalService Title="JointReconstructions" Cover="Covered" />
        <MedicalService Title="JointReplacements" Cover="Covered" />
        <MedicalService Title="KidneyBladder" Cover="Covered" />
        <MedicalService Title="LungChest" Cover="Covered" />
        <MedicalService Title="MaleReproductive" Cover="Covered" />
        <MedicalService Title="MiscarriageTerminationOfPregnancy" Cover="Covered" />
        <MedicalService Title="PainManagement" Cover="Covered" />
        <MedicalService Title="PainManagementWithDevice" Cover="Covered" />
        <MedicalService Title="PalliativeCare" Cover="Covered" />
        <MedicalService Title="PlasticReconstructiveSurgery" Cover="Covered" />
        <MedicalService Title="PodiatricSurgery" Cover="Covered" />
        <MedicalService Title="PregnancyBirth" Cover="Covered" />
        <MedicalService Title="Rehabilitation" Cover="Covered" />
        <MedicalService Title="Skin" Cover="Covered" />
        <MedicalService Title="SleepStudies" Cover="Covered" />
        <MedicalService Title="TonsilsAdenoidsGrommets" Cover="Covered" />
        <MedicalService Title="WeightLossSurgery" Cover="Covered" />
    </MedicalServices>
    <WaitingPeriods>
        <WaitingPeriod Unit="Month" Atomic="true" Title="SubAcute">2</WaitingPeriod>
        <WaitingPeriod Unit="Month" Atomic="true" Title="PreExisting">12</WaitingPeriod>
        <WaitingPeriod Unit="Month" Atomic="true" Title="PregnancyBirth">12</WaitingPeriod>
        <WaitingPeriod Unit="Month" Atomic="true" Title="Other">2</WaitingPeriod>
    </WaitingPeriods>
    <OtherProductFeatures>Co-payments do not apply for any dependant children on the policy. Gap Assist benefit of $200 per person per calendar year.</OtherProductFeatures>
</HospitalCover>
<GeneralHealthCover BasedOnID="">
...

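The problem is that I don't understand how to reference it. For example, if I run root[0][21].tag, I get back {http://admin.privatehealth.gov.au/ws/Schemas}Brands. However, running root[0][22].tag, which I'd expect to return HospitalCover, actually returns {http://admin.privatehealth.gov.au/ws/Schemas}GeneralHealthCover, skipping HospitalCover entirely. I'd like to know why this happens and how I can reference these elements.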

Edit:

For any particular medical service, e.g. "AssistedReproductive", I'd like to be able to check whether it is covered.

Tags: python-3.x, xml

Solution


There are several ways to approach this, but since you're dealing with XML, the most effective is to use XPath:

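from lxml import etree

# Paste the XML from the question here
services = """[your xml above]"""

doc = etree.XML(services)
# local-name() matches on the tag name alone, ignoring the default namespace
for service in doc.xpath('//*[local-name()="MedicalService"]'):
    print(service.attrib['Title'], ": ", service.attrib['Cover'])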

Output [note: in your XML, every service is covered]

AssistedReproductive :  Covered
BackNeckSpine :  Covered
Blood :  Covered
BoneJointMuscle :  Covered
BrainNervousSystem :  Covered
BreastSurgery :  Covered

and so on.
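
To check a single service such as "AssistedReproductive", it may help to look elements up by name rather than by position, since an index like root[0][22] depends on which children a given product happens to contain. The following is only a minimal sketch using the standard library's xml.etree.ElementTree instead of lxml. The namespace URI is taken from the xmlns attribute in the question; the `ph` prefix, the `service_cover` helper, and the trimmed `SAMPLE` document are illustrative assumptions, not part of the original data or answer.

```python
import xml.etree.ElementTree as ET

# Namespace URI copied from the xmlns attribute of the <Product> element.
# The "ph" prefix is an arbitrary label chosen for this sketch.
NS = {"ph": "http://admin.privatehealth.gov.au/ws/Schemas"}

# Trimmed-down stand-in for the product in the question (assumption: the
# real file nests MedicalService elements the same way under <Product>).
SAMPLE = """\
<Product xmlns="http://admin.privatehealth.gov.au/ws/Schemas" ProductCode="J7/WAYB20">
  <Brands />
  <HospitalCover AccidentCover="true" BasedOnID="">
    <MedicalServices>
      <MedicalService Title="AssistedReproductive" Cover="Covered" />
      <MedicalService Title="BackNeckSpine" Cover="Covered" />
    </MedicalServices>
  </HospitalCover>
</Product>
"""


def service_cover(product):
    """Map each MedicalService Title to its Cover value for one <Product>."""
    path = "ph:HospitalCover/ph:MedicalServices/ph:MedicalService"
    return {s.get("Title"): s.get("Cover") for s in product.findall(path, NS)}


product = ET.fromstring(SAMPLE)
cover = service_cover(product)

# Look up one specific service by its Title attribute.
print(cover.get("AssistedReproductive"))  # prints: Covered

# Find HospitalCover by name instead of by positional index.
print(product.find("ph:HospitalCover", NS) is not None)  # prints: True
```

If a product has no HospitalCover element at all, `service_cover` returns an empty dict, so `cover.get(...)` yields None instead of raising. Whether that is what happens for root[0] in the question is not shown in the source, so treat it as something to check rather than a diagnosis.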

