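python - For loop not collecting all of the web-scraped data
Problem description
I am building this web scraper for a project, but it only returns one of the values I am looking for instead of going through the other 18 elements in the list. It returns all the information for one house, but I want the information for the other 18 houses stored in variables as well. Thanks very much.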
'''
import requests
from bs4 import BeautifulSoup
from urllib.request import urlopen as uReq

my_url = "https://www.daft.ie/ireland/property-for-sale/"
# open connection and grab webpage
uClient = uReq(my_url)
# store html in a variable
page_html = uClient.read()
# close web connection
uClient.close()
# parse html
soup = BeautifulSoup(page_html, "html.parser")
print(soup)
# grabs listings house information
listings = soup.findAll("div", {"class": "FeaturedCardPropertyInformation__detailsContainer"})
for container in listings:
    # extracting price
    price = container.div.div.strong.text
    # location
    name_container = container.div.find("a", {"class": "PropertyInformationCommonStyles__addressCopy--link"}).text
    # house type
    house = container.div.find("div", {"class": "QuickPropertyDetails__propertyType"}).text
    # number of bathrooms
    bath_num = container.div.find("div", {"class": "QuickPropertyDetails__iconCopy--WithBorder"}).text
    # number of bedrooms
    bed_num = container.div.find("div", {"class": "QuickPropertyDetails__iconCopy"}).text
'''
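Solution
You can simply create an empty list before the for loop and append all of the variables to it on each iteration, so that all of the data ends up stored in one list.
Your code will look like this: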
data = []
for container in listings:
    # extracting price
    price = container.div.div.strong.text
    # location
    name_container = container.div.find("a", {"class": "PropertyInformationCommonStyles__addressCopy--link"}).text
    # house type
    house = container.div.find("div", {"class": "QuickPropertyDetails__propertyType"}).text
    # number of bathrooms
    bath_num = container.div.find("div", {"class": "QuickPropertyDetails__iconCopy--WithBorder"}).text
    # number of bedrooms
    bed_num = container.div.find("div", {"class": "QuickPropertyDetails__iconCopy"}).text
    data.append((price, name_container, house, bath_num, bed_num))
print(data)
Your final output will look like this:
[('€1,350,000', 'The Penthouse at Hanover Quay, 27 Hanover Dock, Grand Canal Dock, Dublin 2', 'Apartment for sale', '2', '3'), ('€450,000', '9 Na Ceithre Gaoithe Ring, Dungarvan, Co. Waterford', ' Detached House', '4', '5'), ('€390,000', 'Cave, Caherlistrane, Co. Galway', ' Detached House', '4', '5'), ('€720,000', '18 Hazelbrook Road, Terenure, Terenure, Dublin 6', ' Detached House', '3', '4'), ('€210,000', 'Carraig Abhainn, Ballisodare, Co. Sligo', 'Bungalow for sale', '1', '3'), ('€495,000', 'Campbell Court, Cairns Hill, Sligo, Co. Sligo', ' Detached House', '4', '4'), ('€125,000', '33 Leim An Bhradain, Gort Road, Ennis, Co. Clare', 'Apartment for sale', '2', '2'), ('€395,000', '1 Windermere Court, Bishopstown, Bishopstown, Co. Cork', ' End of Terrace House', '3', '4'), ('€349,000', '59 Dun Eoin, Ballinrea Road, Carrigaline, Co. Cork', ' Detached House', '3', '4'), ('€515,000', '2 Elm Walk, Classes Lake, Ovens, Co. Cork', ' Detached House', '5', '4'), ('€490,000', '9 Munster st., Phibsborough, Dublin 7', ' Terraced House', '2', '4'), ('€249,950', '47 Westfields, Clare Road, Ennis, Co. Clare', ' Detached House', '3', '4'), ('€435,000', '3 Castlelough Avenue, Loreto Road, Killarney, Co. Kerry', ' Detached House', '3', '4'), ('€620,000', 'Beaufort House, Knockacleva, Philipstown, Dunleer, Dunleer, Co. Louth', ' Detached House', '3', '5'), ('€550,000', "Flat 5, Saint Ann's Apartments, Donnybrook, Dublin 4", 'Apartment for sale', '2', '2'), ('€675,000', '3 Church Hill, Innishannon, Co. Cork', ' Detached House', '3', '5'), ('€495,000', 'River Lodge, The Rower, Inistioge, Co. Kilkenny', ' Detached House', '4', '4'), ('€325,000', 'Coolgarrane House, Coolgarrane, Thurles, Co. Tipperary', ' Detached House', '1', '4'), ('€399,950', 'No 14 Coopers Grange Old Quarter, Ballincollig, Co. Cork', ' Semi-Detached House', '3', '4')]
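To see why the original loop keeps only one house, here is a minimal sketch that runs without any network access. It assumes a hypothetical `listings` list of plain dictionaries standing in for the parsed containers (the two sample values are copied from the output above), and the names `last_only` and the dictionary keys are illustrative only, not part of the original code.

```python
# Hypothetical stand-ins for the parsed listing containers; the values are
# copied from the first two tuples of the output above.
listings = [
    {"price": "€1,350,000", "house": "Apartment for sale"},
    {"price": "€450,000", "house": " Detached House"},
]

# Original pattern: each pass rebinds the same names, so only the values
# from the final iteration are left once the loop ends.
for container in listings:
    price = container["price"]
    house = container["house"]
last_only = (price, house)

# Fixed pattern: create the list before the loop and append one record
# per iteration. strip() removes the leading space seen in some values.
data = []
for container in listings:
    data.append({"price": container["price"], "house": container["house"].strip()})

print(last_only)
print(len(data))
```

Storing each record as a dictionary instead of a tuple is only one option; it keeps the field names next to the values, which may make the list easier to read later.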