Unable to scrape content from pages in a loop (next page)

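Problem description

I am trying to scrape a paginated site with Selenium in Python. The code I wrote extracts the data from the first page and navigates to page 2, but it fails to extract the content of page 2 and the remaining pages.

I only get the results for page 1.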

from selenium import webdriver
import time
browser = webdriver.Chrome(executable_path='C:\Python27\Scripts\chromedriver.exe')



browser.get("https://www.etsy.com/ca/c/jewelry/necklaces" )


posts= browser.find_elements_by_class_name("text-gray")

for post in posts:

  print post.text

for i in range(1,3):
   u=browser.get('https://www.etsy.com/ca/c/jewelry/necklaces?ref=pagination&page=%s' % str(i))

   print".................................."+ str(i)+"......................................."
time.sleep(10)   
new= u.find_element_by_class_name("text-gray")
for we in new:
   print we.text

This is the error message I get: AttributeError: 'NoneType' object has no attribute 'find_elements_by_class_name'

Tags: python, selenium, web-scraping

Solution


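browser.get() navigates the browser and returns None, so u is None and any element lookup on it raises the AttributeError. Look the elements up on browser itself after each page load, and use find_elements_by_class_name (plural) so that the result is a list you can iterate over. Try this:

from selenium import webdriver
import time

# Raw string so the backslashes in the Windows path are taken literally.
browser = webdriver.Chrome(executable_path=r'C:\Python27\Scripts\chromedriver.exe')
browser.get("https://www.etsy.com/ca/c/jewelry/necklaces")
posts = browser.find_elements_by_class_name("text-gray")

for post in posts:
    print(post.text)  # single-argument print() works on Python 2 and 3

# page=1 is likely the same listing as above; start the range at 2 to skip it.
for i in range(1, 3):
    url = 'https://www.etsy.com/ca/c/jewelry/necklaces?ref=pagination&page=' + str(i)
    browser.get(url)   # returns None, so do not assign or use the result
    time.sleep(10)     # crude wait for the page to finish loading
    # Query the driver itself, not the return value of get().
    # Selenium 4.3+ removed this helper; use find_elements(By.CLASS_NAME, ...) there.
    new = browser.find_elements_by_class_name("text-gray")
    for we in new:
        print(we.text)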
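
The cause of the AttributeError can be checked without launching Chrome. The sketch below is only an illustration under stated assumptions: `FakeBrowser` and `scrape` are hypothetical names, not part of Selenium, and the stub returns plain strings where a real driver would return WebElement objects with a `.text` attribute. It mimics the one fact that matters here: `get()` navigates and returns None, so elements have to be looked up on the driver object after navigation.

```python
class FakeBrowser(object):
    """Hypothetical stand-in for a Selenium driver; runs without Chrome."""

    def __init__(self, pages):
        self.pages = pages      # maps URL -> list of text snippets
        self.current = None     # URL of the page currently "loaded"

    def get(self, url):
        # Like WebDriver.get(), this navigates and implicitly returns None.
        self.current = url

    def find_elements_by_class_name(self, name):
        # A real driver returns WebElement objects; strings are enough here.
        return list(self.pages.get(self.current, []))


BASE = "https://www.etsy.com/ca/c/jewelry/necklaces?ref=pagination&page=%d"


def scrape(browser, first_page, last_page):
    """Collect the matching texts from pages first_page..last_page inclusive."""
    results = []
    for page in range(first_page, last_page + 1):
        browser.get(BASE % page)  # ignore the return value, it is None
        # Query the driver itself once the page has been loaded.
        results.extend(browser.find_elements_by_class_name("text-gray"))
    return results


# Made-up page contents, purely for the demonstration.
fake = FakeBrowser({
    BASE % 1: ["necklace A"],
    BASE % 2: ["necklace B", "necklace C"],
})
texts = scrape(fake, 1, 2)
print(texts)

# The original mistake: treating the return value of get() as a page object.
u = fake.get(BASE % 1)
try:
    u.find_elements_by_class_name("text-gray")
except AttributeError as exc:
    print("AttributeError: %s" % exc)
```

With a real driver the same loop applies: pass the `webdriver.Chrome` instance instead of the stub and read `.text` from each returned element.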
