BeautifulSoup does not return all elements when using a CSS selector

Problem description

I am using BeautifulSoup's .select(), but I'm not sure why it returns only part of the expected results.

My HTML has the following form:

<div class="a">
  <a class="class-type">
  <a class="class-type">
  <a class="class-type">
  <a class="class-type">
  .... {12 times}
</div>
<div class="a">
  <a class="class-type">
  <a class="class-type">
  <a class="class-type">
  <a class="class-type">
  .... {12 times}
</div>
<div class="a">
  <a class="class-type">
  <a class="class-type">
  <a class="class-type">
  <a class="class-type">
  .... {12 times}
</div>

Code:

from bs4 import BeautifulSoup

soup = BeautifulSoup(html, 'lxml')  # html holds the downloaded page source
item_urls = soup.select(".css-ix8km1")

Only 12 items are returned when I expect 36.

Tags: python, beautifulsoup, css-selectors

Solution


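As cody already mentioned, you will need to use a mechanism such as Selenium, presumably because the page only adds the remaining items as you scroll. I tried paging down and was able to get the output with the following code. Before sending Page Down, you need to close the popup ad by clicking its "X" button.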

import time
from selenium import webdriver
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

# Selenium 4 style; change the chromedriver path for your machine
driver = webdriver.Chrome(service=Service('/home/bitto/chromedriver'))
driver.get("https://www.sephora.com/shop/face-makeup?pageSize=300")

# Close the popup ad if it appears within 10 seconds
try:
    element = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.XPATH, "//button[@class='css-1mfnet7 ']"))
    )
    element.click()
except TimeoutException:
    print("Ad was not found")
time.sleep(1)  # not preferred but will do for now

body = driver.find_element(By.TAG_NAME, "body")
item_urls = []
no_of_pagedowns = 3

# Page down a few times so the page can load more products
while no_of_pagedowns:
    body.send_keys(Keys.PAGE_DOWN)
    time.sleep(5)  # not preferred but will do for now
    no_of_pagedowns -= 1

# Collect the product links that are now present in the rendered page
post_elems = driver.find_elements(By.XPATH, "//a[@class='css-ix8km1']")
for post_elem in post_elems:
    item_urls.append(post_elem.get_attribute("href"))
print(item_urls)

Output

['https://www.sephora.com/product/pro-filtr-soft-matte-longwear-foundation-P87985432?icid2=products%20grid:p87985432:product', 'https://www.sephora.com/product/pro-filt-r-instant-retouch-concealer-P88779809?icid2=products%20grid:p88779809:product', 'https://www.sephora.com/product/radiant-creamy-concealer-P377873?icid2=products%20grid:p377873:product', 'https://www.sephora.com/product/translucent-loose-setting-powder-P109908?icid2=products%20grid:p109908:product', 'https://www.sephora.com/product/pro-filt-r-instant-retouch-setting-powder-P88779810?icid2=products%20grid:p88779810:product', 'https://www.sephora.com/product/diamond-bomb-all-over-diamond-veil-P85225585?icid2=products%20grid:p85225585:product', 'https://www.sephora.com/product/the-silk-canvas-P428661?icid2=products%20grid:p428661:product', 'https://www.sephora.com/product/pineapple-my-eye-collector-s-set-P435947?icid2=products%20grid:p435947:product', 'https://www.sephora.com/product/double-wear-stay-in-place-makeup-P378284?icid2=products%20grid:p378284:product', 'https://www.sephora.com/product/ultra-hd-invisible-cover-foundation-P398321?icid2=products%20grid:p398321:product', 'https://www.sephora.com/product/all-nighter-long-lasting-makeup-setting-spray-P263504?icid2=products%20grid:p263504:product', 'https://www.sephora.com/product/your-skin-but-better-cc-cream-spf-50-P411885?icid2=products%20grid:p411885:product', 'https://www.sephora.com/product/luminous-silk-foundation-P393401?icid2=products%20grid:p393401:product', 'https://www.sephora.com/product/born-this-way-P397517?icid2=products%20grid:p397517:product', 'https://www.sephora.com/product/born-this-way-super-coverage-multi-use-sculpting-concealer-P432298?icid2=products%20grid:p432298:product', 'https://www.sephora.com/product/lock-it-tattoo-foundation-P311138?icid2=products%20grid:p311138:product', 'https://www.sephora.com/product/fresh-face-kit-P440030?icid2=products%20grid:p440030:product', 
'https://www.sephora.com/product/teint-idole-ultra-24h-long-wear-foundation-P308201?icid2=products%20grid:p308201:product', 'https://www.sephora.com/product/fauxfilter-foundation-P424302?icid2=products%20grid:p424302:product', 'https://www.sephora.com/product/creaseless-concealer-P433206?icid2=products%20grid:p433206:product', 'https://www.sephora.com/product/bareminerals-original-foundation-broad-spectrum-spf-15-P61003?icid2=products%20grid:p61003:product', 'https://www.sephora.com/product/shimmering-skin-perfector-pressed-P381176?icid2=products%20grid:p381176:product', 'https://www.sephora.com/product/tinted-moisturizer-broad-spectrum-P109936?icid2=products%20grid:p109936:product', 'https://www.sephora.com/product/veil-mineral-primer-P210575?icid2=products%20grid:p210575:product']
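
To check whether the missing items are down to the selector or to the HTML that was actually received, it can help to count the matching anchors without going through BeautifulSoup at all. The sketch below is only an illustration using Python's standard html.parser. The names AnchorClassCounter and count_anchors, and the sample markup, are hypothetical and not part of the original answer. The sample mirrors the structure from the question (3 blocks of 12 links each).

```python
from html.parser import HTMLParser


class AnchorClassCounter(HTMLParser):
    """Count <a> start tags that carry a given CSS class (hypothetical helper)."""

    def __init__(self, css_class):
        super().__init__()
        self.css_class = css_class
        self.count = 0

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        # attrs is a list of (name, value) pairs; value may be None
        classes = (dict(attrs).get("class") or "").split()
        if self.css_class in classes:
            self.count += 1


def count_anchors(html, css_class):
    """Return how many <a> tags in html have css_class among their classes."""
    parser = AnchorClassCounter(css_class)
    parser.feed(html)
    parser.close()
    return parser.count


# Assumed sample shaped like the markup in the question: 3 blocks of 12 links
block = '<div class="a">' + '<a class="class-type" href="/p"></a>' * 12 + "</div>"
sample_html = block * 3
print(count_anchors(sample_html, "class-type"))  # 36
```

If count_anchors(html, "css-ix8km1") on the downloaded source also reports 12, the selector is probably fine and the other items were simply not in that HTML. Running the same check on driver.page_source after paging down should then show a larger count.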
