BeautifulSoup cannot find a meta tag that has content

Problem description

I can see that <meta data-react-helmet="true" name="twitter:title" content="Mamma Mia! Here We Go Again Is the Only Good Thing About This Summer - Vogue"/> exists in the page, but this code does not seem to retrieve it correctly:

import requests
from bs4 import BeautifulSoup

url = 'https://www.vogue.com/article/mamma-mia-2-here-we-go-again-review?mbid=social_twitter'
headers = {'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:61.0) Gecko/20100101 Firefox/61.0'}
response = requests.get(url, headers=headers)

soup = BeautifulSoup(response.text, "lxml")

# find() returns None when no tag in the parsed HTML matches
title = soup.find("meta", {"name": "twitter:title"})
title2 = soup.find("meta", property="og:title")
title3 = soup.find("meta", property="og:description")

# Subscripting a None result is what raises the TypeError in the output
print("TITLE: " + str(title['content']))
print("TITLE2: " + str(title2['content']))
print("TITLE3: " + str(title3['content']))

Output:

File "twitscrape2.py", line 42, in <module>
    print("TITLE: "+str(title['content']))
TypeError: 'NoneType' object has no attribute '__getitem__'

Tags: python, html, beautifulsoup, python-requests

Solution
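The traceback shows that soup.find returned None for the twitter:title lookup, meaning the tag was not present in the HTML that requests actually received. Why it is absent on the live page is not established here. One possibility, stated only as an assumption, is that the data-react-helmet tag is added in the browser after the initial response. Whatever the cause, a None-safe lookup avoids the crash and makes the missing tag visible. The sketch below is a minimal illustration, not a confirmed fix: the helper name meta_content and the SAMPLE_HTML string are hypothetical, and it uses the built-in "html.parser" instead of "lxml" so that it runs without extra dependencies.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for response.text; not the real Vogue page.
SAMPLE_HTML = """
<html><head>
<meta data-react-helmet="true" name="twitter:title" content="Example title"/>
<meta property="og:title" content="Example OG title"/>
</head><body></body></html>
"""


def meta_content(soup, attrs):
    """Return the content attribute of the first matching <meta> tag, or None."""
    tag = soup.find("meta", attrs=attrs)
    # find() returns None when nothing matches, so check before reading attributes.
    if tag is None:
        return None
    # Tag.get returns None instead of raising if the attribute is absent.
    return tag.get("content")


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

found = meta_content(soup, {"name": "twitter:title"})
missing = meta_content(soup, {"property": "og:description"})

print("TITLE:", found)
print("TITLE3:", missing)
```

If a lookup like this returns None against the real response, printing response.text (or searching it for "twitter:title") shows whether the tag was ever in the downloaded HTML.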

