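python - BeautifulSoup "find" method inexplicably returns NoneType
Problem description
I am using the BeautifulSoup module to find images and page links for different kinds of jelly fungi, write them to an HTML file, and display them to the user. Here is my code: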
    import os
    import cfscrape
    import webbrowser
    from bs4 import BeautifulSoup

    spider = cfscrape.CloudflareScraper()

    # Creating a session.
    with spider:
        # Scraping the contents of the main page.
        data = spider.get("https://en.wikipedia.org/wiki/Jelly_fungus").content
        # Grabbing data on each of the types of jelly fungi.
        soup = BeautifulSoup(data, "lxml")
        ul_tags = soup.find_all("ul")
        mushroom_hrefs = ul_tags[1]
        # Creating list to store page links.
        links = []
        # Grabbing the page links for each jelly fungus, and appending them to the links list.
        for mushroom in mushroom_hrefs.find_all("li"):
            for link in mushroom.find_all("a", href=True):
                links.append(link["href"])
        # Creating list to store image links.
        images = []
        # Grabbing the image links from each jelly fungus's page, and appending them to the images list.
        for i, link in enumerate(links, start=1):
            # The hrefs already start with "/", so the base URL has no trailing slash.
            link = "https://en.wikipedia.org" + link
            data = spider.get(link).content
            soup = BeautifulSoup(data, "lxml")
            fungus_info = soup.find("table", {"class": "infobox biota"})
            print(i)
            # The error is raised here when fungus_info is None.
            img = fungus_info.find("img")
            images.append("https:" + img["src"])

    # Checking for an existing html file, if there is one, delete it.
    if os.path.isfile("fungus images.html"):
        os.remove("fungus images.html")

    # Iterating through the jelly fungi images and placing them accordingly in the html file.
    for i, img in enumerate(images):
        links[i] = "https://en.wikipedia.org" + links[i]
        with open("fungus images.html", "a") as html:
            if i == 0:
                html.write(f"""
                <!DOCTYPE html>
                <html>
                <head>
                <title>Fungus</title>
                </head>
                <body>
                <h1>Fungus Images</h1>
                <a href="{links[i]}">
                <img src="{img}">
                </a>
                """)
            # Stop one short of the end so the final branch can close the document.
            elif i < len(images) - 1:
                html.write(f"""
                <a href="{links[i]}">
                <img src="{img}">
                </a>
                """)
            else:
                html.write(f"""
                <a href="{links[i]}">
                <img src="{img}">
                </a>
                </body>
                </html>
                """)

    webbrowser.open("fungus images.html")
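On line 45 of my script (the loop over the links list), I start iterating through each fungus's page to find the info table that contains its picture. This works for the first 17 pages, but for some reason the find call returns None on the Tremeldendron page. I don't know why this happens, since its table has the same class as the tables on the other fungus pages.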
Solution
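One possible explanation, offered as an assumption rather than a confirmed diagnosis, is that the class attribute on that page is not the exact string "infobox biota" (for example, the classes appear in a different order or alongside an extra class), or that the page has no such table at all. Passing {"class": "infobox biota"} to find matches only that exact attribute string, whereas a CSS selector such as table.infobox.biota matches both classes in any order. The sketch below uses select_one and checks for None before touching the result. The helper name first_infobox_image_url and the html.parser backend are illustrative choices, not part of the original code.

```python
from bs4 import BeautifulSoup


def first_infobox_image_url(html):
    """Return the URL of the first image in the taxobox, or None if there is none.

    Hypothetical helper for illustration; it is not part of the original script.
    """
    # "html.parser" is used here only to avoid the lxml dependency (an assumption).
    soup = BeautifulSoup(html, "html.parser")
    # The CSS selector matches a table carrying both classes, in any order and
    # even with additional classes present.
    infobox = soup.select_one("table.infobox.biota")
    if infobox is None:
        return None
    img = infobox.find("img", src=True)
    if img is None:
        return None
    src = img["src"]
    # Protocol-relative sources ("//host/path") need a scheme to be usable in a file.
    return "https:" + src if src.startswith("//") else src
```

Inside the scraping loop, the result can be checked before appending, so that a page without a matching table is skipped (or logged with its link) instead of raising an AttributeError. If pages are skipped, the page links and image links should be stored together as pairs so they stay aligned when the HTML file is written.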