BeautifulSoup returns None

Problem description

I'm trying to write a script that tracks Amazon prices, but I don't understand why it gives me this error:

Traceback (most recent call last):
  File "scraping_amazon.py", line 12, in <module>
    price = soup.find('span', class_ = 'a-size-medium a-color-price priceBlockBuyingPriceString').text
AttributeError: 'NoneType' object has no attribute 'text'

Here is my script so far:

import requests
from bs4 import BeautifulSoup

URL = 'https://www.amazon.de/Sony-Vollformat-Digitalkamera-Megapixel-SEL-2870/dp/B00FWUDEEC/ref=sr_1_4?__mk_de_DE=%C3%85M%C3%85%C5%BD%C3%95%C3%91&dchild=1&keywords=sony+a7&qid=1604245969&quartzVehicle=5-672&replacementKeywords=sony&sr=8-4'

page = requests.get(URL)

soup = BeautifulSoup(page.text, 'html.parser')

price = soup.find('span', class_ = 'a-size-medium a-color-price priceBlockBuyingPriceString').text

print(price)

I followed the same process as in my other web-scraping scripts. Those work, but this one doesn't.

Any ideas? Thanks.

Tags: python-3.x, web-scraping, beautifulsoup

Solution


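The page content is loaded dynamically with JavaScript, so the HTML returned by requests.get does not contain the price element and soup.find returns None. You have to use something like Selenium to scrape dynamically loaded pages. Here is the complete code to do it: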

import time

from bs4 import BeautifulSoup
from selenium import webdriver

URL = 'https://www.amazon.de/Sony-Vollformat-Digitalkamera-Megapixel-SEL-2870/dp/B00FWUDEEC/ref=sr_1_4?__mk_de_DE=%C3%85M%C3%85%C5%BD%C3%95%C3%91&dchild=1&keywords=sony+a7&qid=1604245969&quartzVehicle=5-672&replacementKeywords=sony&sr=8-4'

# Open the page in a real browser so that its JavaScript runs
driver = webdriver.Chrome()
driver.get(URL)

# Give the dynamically loaded content time to render
time.sleep(4)

# Parse the rendered HTML (the html5lib parser must be installed)
soup = BeautifulSoup(driver.page_source, 'html5lib')

price = soup.find('span', class_='a-size-medium a-color-price priceBlockBuyingPriceString').text

print(price)

# Close the browser and end the WebDriver session
driver.quit()

Output:

962,16 €
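
The AttributeError in the question comes from calling .text on the None that soup.find returns when nothing matches. A minimal sketch of guarding against that case might look like the following. The helper name extract_price and the two inline HTML snippets are hypothetical, made up for illustration so the example runs without a network request; they are not Amazon's actual markup.

```python
from bs4 import BeautifulSoup

# Same class string as in the scripts above
PRICE_CLASS = 'a-size-medium a-color-price priceBlockBuyingPriceString'


def extract_price(html):
    """Return the price text, or None if the price span is not in the HTML.

    Hypothetical helper for illustration only.
    """
    soup = BeautifulSoup(html, 'html.parser')
    tag = soup.find('span', class_=PRICE_CLASS)
    if tag is None:
        # find() found no match, e.g. because the content is
        # filled in later by JavaScript
        return None
    return tag.get_text(strip=True)


# Hypothetical snippets standing in for a rendered and an unrendered page
with_price = ('<span class="a-size-medium a-color-price '
              'priceBlockBuyingPriceString">962,16 €</span>')
without_price = '<html><body><p>Loading...</p></body></html>'

print(extract_price(with_price))
print(extract_price(without_price))
```

Checking for None before reading the text turns the crash into a value the script can test, which makes it easier to tell a missing element apart from a real price.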
