AttributeError: 'NoneType' object has no attribute 'text' (finds another tag in the same location fine, but can't find the one I need)

Problem description

I have an XML file that I am trying to extract information from. Here is an excerpt of the file:

<?xml version="1.0" encoding="UTF-8"?>
<Terms>
    <Term>
        <Title>.177 (4.5mm) Airgun</Title>
        <Description>The standard airgun calibre for international target shooting.</Description>
        <RelatedTerms>
            <Term>
                <Title>Shooting sport equipment</Title>
                <Relationship>Narrower Term</Relationship>
            </Term>
        </RelatedTerms>
    </Term>
    <Term>
        <Title>.22</Title>
        <Description>A rimfire calibre, much used in target shooting and often synonymous with the term smallbore.</Description>
        <RelatedTerms>
            <Term>
                <Title>Shooting sport equipment</Title>
                <Relationship>Narrower Term</Relationship>
            </Term>
        </RelatedTerms>
    </Term>
    <Term>
        <Title>.22 Long Rifle</Title>
        <Description>The standard .22 rimfire cartridge for target rifle and pistol use.</Description>
        <RelatedTerms>
            <Term>
                <Title>Shooting sport equipment</Title>
                <Relationship>Narrower Term</Relationship>
            </Term>
        </RelatedTerms>
    </Term>
    <Term>
        <Title>.22 Short</Title>
        <Description>Used as a target shooting round for timed fire pistol competitions.</Description>
        <RelatedTerms>
            <Term>
                <Title>Shooting sport equipment</Title>
                <Relationship>Narrower Term</Relationship>
            </Term>
        </RelatedTerms>
    </Term>
</Terms>

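Below is the code I have written. If I comment out "Title": rterm.find('Title').text, the whole script runs and does exactly what it needs to do. However, when I run it with that line included, I get AttributeError: 'NoneType' object has no attribute 'text'.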

import json
from bs4 import BeautifulSoup  # the 'lxml-xml' parser needs the lxml package installed

# The file is opened as text, so from_encoding is not needed
with open('xml.xml', encoding='UTF-8') as xml_file:
    soup = BeautifulSoup(xml_file, 'lxml-xml')

Terms = soup.select('Terms > Term')  # Term elements directly under <Terms>
jsonObj = {"thesaurus": []}

for term in Terms:
    termDetail = {
        "Description": term.find('Description').text,
        "Title": term.find('Title').text
    }
    RelatedTerms = term.select('RelatedTerms > Term')
    if RelatedTerms:
        termDetail["RelatedTerms"] = []
        for rterm in RelatedTerms:
            termDetail["RelatedTerms"].append({
                "Title": rterm.find('Title').text,  # this line raises the AttributeError
                "Relationship": rterm.find('Relationship').text
            })
    jsonObj["thesaurus"].append(termDetail)

with open(r'D:\UNI\data_science\fit5196\assessment_1b\json.dat', 'w') as json_file:
    json.dump(jsonObj, json_file, indent=3, sort_keys=True)

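I really don't know why this error occurs. The Title tag does contain text, and the Relationship tag in the same place is found without any problem. I am at my wit's end.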

Any and all suggestions are appreciated.

Tags: python, json, xml, beautifulsoup, xml-parsing

Solution
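
The error message means that find('Title') returned None for at least one element, so .text was called on None. The excerpt in the question has a Title in every related Term, so one plausible explanation (an assumption, not something the question confirms) is that some related Term elsewhere in the full file has no Title child. The sketch below shows a way to check for that and to extract the data without crashing. It uses the standard-library xml.etree.ElementTree instead of BeautifulSoup so that it runs without extra packages. The sample XML and the names child_text, build_thesaurus and related_terms_without_title are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the second related Term has no <Title>. This is the
# assumed (unconfirmed) shape of the entry that triggers the error.
SAMPLE = """<Terms>
    <Term>
        <Title>.22</Title>
        <Description>A rimfire calibre.</Description>
        <RelatedTerms>
            <Term>
                <Title>Shooting sport equipment</Title>
                <Relationship>Narrower Term</Relationship>
            </Term>
            <Term>
                <Relationship>Related Term</Relationship>
            </Term>
        </RelatedTerms>
    </Term>
</Terms>
"""


def child_text(node, name, default=None):
    """Return the text of the first matching child, or default if it is absent."""
    found = node.find(name)
    # Compare with None explicitly: an ElementTree element without children is falsy.
    return found.text if found is not None else default


def build_thesaurus(root):
    """Build the same dictionary layout as in the question, tolerating missing tags."""
    thesaurus = []
    for term in root.findall('Term'):
        detail = {
            "Title": child_text(term, 'Title'),
            "Description": child_text(term, 'Description'),
        }
        related = term.findall('RelatedTerms/Term')
        if related:
            detail["RelatedTerms"] = [
                {
                    "Title": child_text(rterm, 'Title'),
                    "Relationship": child_text(rterm, 'Relationship'),
                }
                for rterm in related
            ]
        thesaurus.append(detail)
    return {"thesaurus": thesaurus}


def related_terms_without_title(root):
    """List (parent title, index) pairs for related terms that lack a Title."""
    missing = []
    for term in root.findall('Term'):
        for i, rterm in enumerate(term.findall('RelatedTerms/Term')):
            if rterm.find('Title') is None:
                missing.append((child_text(term, 'Title'), i))
    return missing


root = ET.fromstring(SAMPLE)
result = build_thesaurus(root)
missing = related_terms_without_title(root)
print(missing)
```

Running related_terms_without_title on the real file should show whether any related Term is actually missing its Title. If the list comes back empty, this assumption is wrong and the cause lies elsewhere. The child_text helper only relies on find(name) returning None when nothing matches, which is also how find behaves on a BeautifulSoup Tag, so the same guard should carry over to the original code.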

