Beautiful Soup finds nothing on Sainsbury's

Problem description

Similar to "Beautiful Soup find returns nothing from Rightmove", but for a different site: https://www.sainsburys.co.uk/gol-ui/SearchDisplayView?filters[keyword]=milk

I tried running:

import os
from selenium import webdriver
from bs4 import BeautifulSoup as soup

url='https://www.sainsburys.co.uk/gol-ui/SearchDisplayView?filters[keyword]=banana'

# configure driver
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument("--headless")
chrome_driver = os.getcwd() + "\\chromedriver.exe"  # IF NOT IN SAME FOLDER CHANGE THIS PATH
driver = webdriver.Chrome(options=chrome_options, executable_path=chrome_driver)
driver.get(url)

page = driver.page_source
page_soup = soup(page,'html.parser')

container_tag1='pt__content'
containers = page_soup.findAll("div",{"class":container_tag1})
# print(containers)
print(len(containers))

but to no avail.

I also tried without Selenium, but that failed too.

Any suggestions?

Tags: pandas, beautifulsoup

Solution


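You need to wait for the page to finish rendering before passing the HTML to BeautifulSoup. One option is to use the sleep function from the built-in time module.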

from time import sleep
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from bs4 import BeautifulSoup

URL = "https://www.sainsburys.co.uk/gol-ui/SearchDisplayView?filters[keyword]=banana"

# Selenium 4 takes the driver path through a Service object
driver = webdriver.Chrome(service=Service(r"c:\path\to\chromedriver.exe"))
driver.get(URL)
sleep(5)  # <-- Wait for the page to fully render

soup = BeautifulSoup(driver.page_source, "html.parser")
print(soup.find_all("div", {"class": "pt__content"}))
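
A fixed sleep either wastes time or is too short on a slow connection. As an alternative (not part of the answer above), Selenium's explicit waits can poll until the target elements are actually in the DOM. The following is a minimal sketch, assuming Selenium 4 and that chromedriver can be located automatically or on PATH; the helper names fetch_rendered_html and find_containers and the 15-second timeout are made up for illustration.

```python
from bs4 import BeautifulSoup

URL = "https://www.sainsburys.co.uk/gol-ui/SearchDisplayView?filters[keyword]=banana"
CONTAINER_CLASS = "pt__content"


def fetch_rendered_html(url, css_selector, timeout=15):
    """Load url in Chrome and return the page source once css_selector matches."""
    # Selenium is imported here so find_containers() stays usable without it.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    # Assumption: chromedriver is on PATH or resolved by Selenium itself.
    driver = webdriver.Chrome()
    try:
        driver.get(url)
        # Polls until at least one matching element is present in the DOM;
        # raises TimeoutException if none shows up within `timeout` seconds.
        WebDriverWait(driver, timeout).until(
            EC.presence_of_all_elements_located((By.CSS_SELECTOR, css_selector))
        )
        return driver.page_source
    finally:
        driver.quit()  # always close the browser, even on timeout


def find_containers(html, class_name=CONTAINER_CLASS):
    """Return every <div> whose class list includes class_name."""
    return BeautifulSoup(html, "html.parser").find_all("div", class_=class_name)


if __name__ == "__main__":
    html = fetch_rendered_html(URL, f"div.{CONTAINER_CLASS}")
    print(len(find_containers(html)))
```

If the wait times out, the class name has probably changed on the site or the content is being blocked, which is easier to diagnose than an empty list after a blind sleep.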
