
Problem description

Tags: python-3.x, web-scraping, beautifulsoup, python-requests, request

Solution


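The content is served dynamically (rendered by JavaScript in the browser), so you will not get it with requests alone; take a look at the selenium code in the example.

To get rid of the 'Was: ' label text and the surrounding whitespace, you can do the following: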

.get_text(strip=True).replace('Was: ','')
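
To see what that chain does in isolation, here is a minimal sketch that runs it on a small hand-written HTML fragment. The markup in SAMPLE_HTML and the helper name extract_old_prices are made up for illustration; only the tag name and class come from the answer, and the real page's markup may differ.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, written by hand for illustration only.
SAMPLE_HTML = """
<small class="product-views-price-old">
    Was: £2.20
</small>
<small class="product-views-price-old">  Was: £18.61  </small>
"""


def extract_old_prices(html):
    """Return old prices with the 'Was: ' label and padding removed."""
    soup = BeautifulSoup(html, "html.parser")
    return [
        # strip=True trims leading/trailing whitespace from the tag's text;
        # replace() then drops the literal 'Was: ' label.
        tag.get_text(strip=True).replace("Was: ", "")
        for tag in soup.find_all("small", class_="product-views-price-old")
    ]


prices = extract_old_prices(SAMPLE_HTML)
print(prices)
```

Note that replace() removes every occurrence of 'Was: ' in the string, which is fine here because the label appears only once per tag.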

Example

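from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from bs4 import BeautifulSoup
import time

url = "https://www.petshop.co.uk/Dog"
# Raw string so the backslashes in the Windows path are not read as escapes.
# Selenium 4 takes the driver path through a Service object.
service = Service(r'C:\Program Files\ChromeDriver\chromedriver.exe')
driver = webdriver.Chrome(service=service)
driver.get(url)
time.sleep(3)  # crude fixed wait so the JavaScript-rendered prices can load

# Parse the rendered page source, not the raw HTTP response
html = driver.page_source
soup = BeautifulSoup(html, 'html.parser')
for old_price in soup.find_all("small", class_="product-views-price-old"):
    print(old_price.get_text(strip=True).replace('Was: ', ''))

driver.quit()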

Output

£2.20
£18.61
£27.00
£38.39
£38.39
£20.65
£1.30
£67.99
£20.65
£1.30
£54.95
£30.99
