How to scroll slowly to the end of a page with Selenium so that dynamically loaded content can be retrieved

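Problem description

In a personal project, I need to use Selenium to scrape item names from a dynamic site.

To get all the data, you have to scroll to the bottom.

However, if you scroll to the bottom quickly, you only get the names of the items at the bottom, which makes things trickier. No matter how long you wait, you still only get the items that are in view.

So I thought I could scroll to the bottom slowly, but that doesn't seem to work.

Here is my demo code to illustrate the problem: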

url = 'https://shopzetu.com/search?type=product,article,page&q=dress'
driver.get(url)

# driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")  # this works but you get items at bottom only
# this scrolls slowly to end
driver.execute_script("function pageScroll() {window.scrollBy(0,50);scrolldelay = setTimeout('pageScroll()',1000);}pageScroll()")
time.sleep(2)
products = driver.find_elements_by_class_name("grid-product__content")
for product in products:
    name = product.find_element_by_class_name("grid-product__title").text
    print(name)

Any ideas?

Extra (imports and configuration)

import time
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys

chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--no-sandbox')
chrome_options.add_argument('--window-size=1420,1080')
chrome_options.add_argument('--headless')
chrome_options.add_argument('--disable-gpu')
chrome_options.add_argument('--disable-dev-shm-usage')
driver = webdriver.Chrome(options=chrome_options)  # `chrome_options=` is the deprecated spelling of this keyword

Tags: python, selenium, selenium-webdriver

Solution

# Send the END key to the root element to jump to the bottom of the page
page = driver.find_element(By.TAG_NAME, "html")
page.send_keys(Keys.END)
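
A single END key press may not be enough if each jump to the bottom makes the page load another batch of items. The following is only a sketch under that assumption: it keeps pressing END until `document.body.scrollHeight` stops changing. The helper name `press_end_until_stable` and the parameters `end_key`, `pause` and `max_rounds` are hypothetical and not part of Selenium; `page` is the `html` element from the snippet above, and `end_key` is meant to be `Keys.END`.

```python
import time


def press_end_until_stable(driver, page, end_key, pause=1.0, max_rounds=50):
    """Press END on `page` until the document height stops growing.

    `driver` needs Selenium's `execute_script`, `page` needs `send_keys`.
    Returns the number of key presses that were sent.
    """
    height_js = "return document.body.scrollHeight;"
    last_height = driver.execute_script(height_js)
    for presses in range(1, max_rounds + 1):
        page.send_keys(end_key)
        time.sleep(pause)  # assumed to be long enough for the next batch to load
        new_height = driver.execute_script(height_js)
        if new_height == last_height:
            # Nothing was appended after this press, so stop here.
            return presses
        last_height = new_height
    return max_rounds
```

It could be called as `press_end_until_stable(driver, page, Keys.END)` before collecting the products. If the site loads slowly, `pause` probably needs to be larger, otherwise the loop can stop before the last batch arrives.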

How to apply it in this case:

from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as ec

chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--no-sandbox')
chrome_options.add_argument('--window-size=1420,1080')
chrome_options.add_argument('--disable-gpu')
chrome_options.add_argument('--disable-dev-shm-usage')
driver = webdriver.Chrome(options=chrome_options)  # `chrome_options=` is the deprecated spelling of this keyword

url = 'https://shopzetu.com/search?type=product,article,page&q=dress'
driver.get(url)

WebDriverWait(driver, 10).until(ec.element_to_be_clickable((By.XPATH, "//button[text()='No thanks']"))).click()
# Jump to the bottom of the page, then read the product titles
page = driver.find_element(By.TAG_NAME, "html")
page.send_keys(Keys.END)
products = driver.find_elements(By.CLASS_NAME, "grid-product__content")
for product in products:
    name = product.find_element(By.CLASS_NAME, "grid-product__title").text
    print(name)
page.send_keys(Keys.END)
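
Since the question asks about scrolling slowly, here is a hedged alternative sketch rather than the approach of the answer above: the loop is driven from Python, scrolls a few hundred pixels at a time with `window.scrollBy`, and reads the names after every step, so items that only render while they are in view are still captured. The function name `scroll_and_collect`, the `read_names` callback and the default values are assumptions made for illustration.

```python
import time


def scroll_and_collect(driver, read_names, step=400, pause=0.5, max_steps=500):
    """Scroll down in small steps and collect names seen along the way.

    `driver` needs Selenium's `execute_script`; `read_names` is a callable
    returning the names currently readable on the page.
    Returns the unique, non-empty names in first-seen order.
    """
    seen = {}  # a dict keeps insertion order and drops duplicates
    for _ in range(max_steps):
        for name in read_names():
            if name:  # titles that are not rendered yet may come back empty
                seen.setdefault(name, None)
        # Re-check on every step because the page can grow as items load.
        at_bottom = driver.execute_script(
            "return window.innerHeight + window.pageYOffset"
            " >= document.body.scrollHeight - 1;"
        )
        if at_bottom:
            break
        driver.execute_script("window.scrollBy(0, arguments[0]);", step)
        time.sleep(pause)  # give lazily loaded items time to appear
    return list(seen)
```

With the setup above, `read_names` could be `lambda: [e.text for e in driver.find_elements(By.CLASS_NAME, "grid-product__title")]`. Note that this sketch merges products that share the same name, and that a `pause` shorter than the site's loading time can end the loop too early.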
