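javascript - How to find all elements on a web page by scrolling, using Selenium WebDriver and Python
Problem description
I can't seem to get all the elements on a web page, no matter what I try with Selenium. I'm sure I'm missing something. Here is my code. The URL has at least 30 elements, but every time I scrape it only 6 elements are returned. What am I missing?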
import requests
import webbrowser
import time
from bs4 import BeautifulSoup as bs
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium.common.exceptions import NoSuchElementException

headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36'}
url = 'https://www.adidas.com/us/men-shoes-new_arrivals'
res = requests.get(url, headers=headers)
page_soup = bs(res.text, "html.parser")
containers = page_soup.findAll("div", {"class": "gl-product-card-container show-variation-carousel"})
print(len(containers))

# For each container, find the shoe model and its review count
shoe_colors = []
for container in containers:
    if container.find("div", {'class': 'gl-product-card__reviews-number'}) is not None:
        shoe_model = container.div.div.img["title"]
        review = container.find('div', {'class': 'gl-product-card__reviews-number'})
        review = int(review.text)

driver = webdriver.Chrome()
driver.get(url)
time.sleep(5)
# find_elements_by_css_selector was removed in Selenium 4.3; use find_elements with By
shoe_prices = driver.find_elements(By.CSS_SELECTOR, '.gl-price')
for price in shoe_prices:
    print(price.text)
print(len(shoe_prices))
Solution
You have to scroll down the page slowly. The page requests the price data via AJAX only when a product scrolls into view.
import time

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By

options = Options()
options.add_argument('--start-maximized')
driver = webdriver.Chrome(options=options)
url = 'https://www.adidas.com/us/men-shoes-new_arrivals'
driver.get(url)

# One scroll step per row of products (the grid shows 4 products per row)
scroll_times = len(driver.find_elements(By.CLASS_NAME, 'col-s-6')) / 4
scrolled = 0
scroll_size = 400
while scrolled < scroll_times:
    driver.execute_script('window.scrollTo(0, arguments[0]);', scroll_size)
    scrolled += 1
    scroll_size += 400
    time.sleep(1)  # give the AJAX price requests time to complete

shoe_prices = driver.find_elements(By.CLASS_NAME, 'gl-price')
for price in shoe_prices:
    print(price.text)
print(len(shoe_prices))
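The loop above derives its number of steps from the `col-s-6` class and a four-column layout, both of which may change with the site's markup or the window width. As a possible variation (not part of the original answer), the sketch below keeps scrolling in fixed steps until the scroll position passes the current page height, re-reading the height on each pass in case lazy loading makes the page grow. The function name `scroll_until_bottom` and its parameters `step`, `pause`, and `max_steps` are hypothetical; the only Selenium call it relies on is `driver.execute_script`.

```python
import time


def scroll_until_bottom(driver, step=400, pause=1.0, max_steps=200):
    """Scroll down in `step`-pixel increments until the page bottom is passed.

    `driver` is any object exposing Selenium's `execute_script(script, *args)`.
    `max_steps` is a safety cap for pages that keep growing (an assumption,
    tune it for the page at hand). Returns the offsets that were scrolled to.
    """
    offsets = []
    position = 0
    for _ in range(max_steps):
        # Re-read the height every pass: lazy loading can lengthen the page.
        height = driver.execute_script("return document.body.scrollHeight;")
        if position >= height:
            break
        position += step
        driver.execute_script("window.scrollTo(0, arguments[0]);", position)
        offsets.append(position)
        time.sleep(pause)  # leave time for the AJAX requests to finish
    return offsets
```

Calling `scroll_until_bottom(driver)` right after `driver.get(url)` and before collecting the `gl-price` elements would take the place of the `while` loop. The one-second pause is carried over from the answer; a slower connection may need a longer one.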