How do I scrape all products with Python Selenium?

Problem description

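I have the following code in Python (Anaconda) to scrape all 347 items for my own practice. However, it only scrapes the first 96 items. How can I fix this? Am I doing something wrong with ActionChains?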


import ast
import time
from urllib.request import urlopen

import requests
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

driver = webdriver.Chrome('C:/Users/cc/Documents/chromedriver.exe')
user_input=input("input path")
prefix="https://www.toysrus.com.sg/"
url=prefix+user_input
driver.get(url)

response = requests.get(url)
response_text = response.text
soup = BeautifulSoup(response_text, 'lxml')
text = urlopen(url).read()
soup = BeautifulSoup(text)
data = soup.findAll('div',attrs={'class':'card-image-wrapper'})
toc = soup.find_all('div',attrs={'class':'result-count text-center'})
emptylist2=[]


while True:
    try:
        driver.implicitly_wait(3)
        expandable = WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.CSS_SELECTOR, ".button-class")))
        expandables=driver.find_element_by_xpath('/html/body/div[4]/div[3]/div[1]/div[3]/div[2]/div/div[2]/div/div[97]/div[1]/button').click()
        for item in expandables:
            ActionChains(driver).move_to_element(item).click().perform()  # item.click()
    except Exception as e:
        print(e)
        break



for item in toc:
    print(item.text.strip()[:-1])

    for div in data:
        links = div.findAll('a')
        for a in links:
            catalogueresult = ast.literal_eval("" + a['href'][1:-5][-7:])
            print(catalogueresult)

Tags: python, selenium

Solution
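Three things in the question's code look likely to cause the 96-item limit. First, the product list is parsed from a separate `requests`/`urlopen` fetch, which only sees the initial HTML and never the items that the browser loads after the button is clicked. Second, `find_element_by_xpath(...).click()` returns `None`, so the `for item in expandables` loop raises an exception and the `while` loop exits after a single click. Third, the absolute XPath ending in `div[97]` can only match the button while it sits right after the first 96 cards.

The following is a minimal sketch of one way around this, not a tested solution for this particular site. It assumes Selenium 4.6 or later (where `find_element_by_xpath` is no longer available and the driver binary is located automatically), that `.button-class` from the question really matches the "load more" button, that each product sits in a `div.card-image-wrapper`, and that product links end in digits followed by `.html`. The function names are hypothetical.

```python
import re


def extract_product_id(href):
    """Return the digits just before '.html' in a product link, or None."""
    match = re.search(r"(\d+)\.html", href or "")
    return match.group(1) if match else None


def load_all_products(driver, button_selector, card_selector,
                      timeout=10, max_clicks=50):
    """Click the 'load more' button until no new product cards appear."""
    # Selenium is imported lazily so extract_product_id works without it.
    from selenium.common.exceptions import TimeoutException
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    wait = WebDriverWait(driver, timeout)
    for _ in range(max_clicks):
        before = len(driver.find_elements(By.CSS_SELECTOR, card_selector))
        try:
            button = wait.until(
                EC.element_to_be_clickable((By.CSS_SELECTOR, button_selector))
            )
            # A JavaScript click is not blocked by overlapping elements.
            driver.execute_script("arguments[0].click();", button)
            # Wait for the next batch of cards before looking for the button again.
            wait.until(
                lambda d: len(d.find_elements(By.CSS_SELECTOR, card_selector)) > before
            )
        except TimeoutException:
            break  # no button left, or no new cards arrived
    return len(driver.find_elements(By.CSS_SELECTOR, card_selector))


def scrape_product_ids(url, button_selector=".button-class",
                       card_selector="div.card-image-wrapper"):
    """Open the listing, expand it fully, and return the product IDs found."""
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Chrome()  # assumes Selenium 4.6+ (driver located automatically)
    try:
        driver.get(url)
        load_all_products(driver, button_selector, card_selector)
        # Read links from the live DOM, not from a separate requests/urlopen fetch.
        links = driver.find_elements(By.CSS_SELECTOR, card_selector + " a")
        ids = [extract_product_id(a.get_attribute("href")) for a in links]
        return [i for i in ids if i is not None]
    finally:
        driver.quit()
```

Calling `scrape_product_ids("https://www.toysrus.com.sg/" + user_input)` would then return the IDs from the fully expanded page. If the count still stops short of 347, the selectors are the first thing to check in the browser's developer tools, since they are guesses taken from the question's code.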

