How to scrape product prices for a specific region

Problem description

As an exercise, I am trying to scrape information about washing machines from Lowes: https://www.lowes.com/pl/Washing-machines-Washers-dryers-Appliances/4294857977

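To get the price, I need to find a div with the class "product-pricing" and then read the text of the span inside it. However, when I inspect the div in the browser, it looks completely different from what I get when I scrape it with BeautifulSoup. When I inspect it, it looks like this: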

<div class="product-pricing">
<div class="pl-price js-pl-price" tabindex="-1">                 

     <!-- Was Price -->
     <div class="v-spacing-mini">
           <span class="h5 js-price met-product-price art-pl-contractPricing0" data-met-type="was">$499.00</span>
     </div>
     <div class="v-spacing-mini">
           <p class="darkMidGrey art-pl-wasPriceLbl0">was: $749.00</p>

              <small class="green small art-pl-saveThruLbl0">SAVE 33% thru 10/30/2018</small><br>
     </div>

  <!-- Start of Product Family Pricing -->

  <!-- Contractor Pack Messaging -->

  <!-- End of Product Family Pricing -->
  </div>
  <div class="v-spacing-small">
     <a role="link" tabindex="-1" data-toggle="popover" aria-haspopup="true" data-trigger="focus" data-placement="bottom auto" data-content="FREE local delivery applies to any major appliance $396 or more, full-size gas grills $498 or more, patio furniture orders $498 or more, and riding and ZTR mowers $999 or more. Applies to standard deliveries in US only. Purchase threshold calculated before taxes, after applicable discounts, if any. Additional fees may apply." data-original-title="Free Delivery" class="js-truck-delivery"><i class="icon-truck" title="FREE Delivery" aria-label="FREE Delivery."></i> <strong>FREE Delivery</strong></a>
  </div>
</div>

But when I scrape it, this is what I see:

<div class="product-pricing">
<div class="v-spacing-jumbo clearfix">
  <a aria-haspopup="true" class="js-enter-location" data-content="Since Lowes.com is national in scope, we check inventory at your local store first in an effort to fulfill your order more quickly. You may find product or pricing that differ from that of your local store, but we make every effort to minimize those differences so you can get exactly what you want at the best possible price." data-placement="top auto" data-toggle="popover" data-trigger="focus" role="link" tabindex="-1">
     <p class="h6" id="ada-enter-location"><span>Enter your location</span>
        <i aria-hidden="true" class="icon-info royalBlue"></i>
     </p>
  </a>
  <p class="small-type secondary-text" tabindex="-1">for pricing and availability.</p>
</div>
<form action="#" class="met-zip-container js-store-locator-form" data-modal-open="true" data-zip-in="true" id="store-locator-form">
  <input name="redirectUrl" type="hidden" value="/pl/Washing-machines-Washers-dryers-Appliances/4294857977"/>
  <div class="form-group product-form-group">
     <div class="input-group">
        <input aria-label="Enter your zip code" autocompletetype="find-a-store-search" class="form-control js-list-zip-entry-input met-zip-code" name="searchTerm" placeholder="ZIP Code" role="textbox" tabindex="-1" type="text"/>
        <span class="input-group-btn">
        <button class="btn btn-primary js-list-zip-entry-submit met-zip-submit" data-linkid="get-pricing-and-availability-zip-in-modal-submit" tabindex="-1" type="submit">OK</button>
        </span>
     </div>
     <span class="inline-help">ZIP Code</span>
  </div>
 </form>
</div>

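I think this has to do with the site needing my location to determine the correct price. There seems to be a hidden input, and my browser knows my location and tells the site. Is there a way to get BeautifulSoup to scrape the price that appears after my location has been checked?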

Here is the code I am using:

import re
import bs4
from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup

my_url = 'https://www.lowes.com/pl/Washing-machines-Washers-dryers-Appliances/4294857977'

uClient = uReq(my_url)

page_html = uClient.read()
uClient.close()

page_soup = soup(page_html, features = "lxml")

containers = page_soup.findAll("div", {"class":"product-wrapper-right"})
for container in containers:
    price = container.findAll("span", {"class":"js-price"})[0].text

Edit: the specific code that gives me the second HTML is

container.findAll("div", {"class":"product-pricing"})   

Tags: python, html, web-scraping, beautifulsoup

Solution


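I'm not 100% sure this will solve your problem, but using Selenium may help: it drives an actual browser, so it sends the same data a normal browser sends when visiting a site.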
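Before switching tools, it may be worth confirming that the plain HTTP response really contains the location prompt rather than prices. The sketch below is only a diagnostic, not part of the original answer: it counts `span` tags with the `js-price` class and `a` tags with the `js-enter-location` class, both taken from the two HTML dumps in the question. The names `PricingProbe` and `probe` are made up for illustration, and the standard-library parser is used here only to keep the check free of dependencies.

```python
from html.parser import HTMLParser


class PricingProbe(HTMLParser):
    """Counts price spans and "Enter your location" prompts in a page."""

    def __init__(self):
        super().__init__()
        self.price_spans = 0
        self.location_prompts = 0

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value is None for bare attributes
        classes = (dict(attrs).get("class") or "").split()
        if tag == "span" and "js-price" in classes:
            self.price_spans += 1
        elif tag == "a" and "js-enter-location" in classes:
            self.location_prompts += 1


def probe(html):
    """Return (number of price spans, number of location prompts) found in html."""
    parser = PricingProbe()
    parser.feed(html)
    return parser.price_spans, parser.location_prompts
```

Feeding it the decoded page from the question, for example `probe(page_html.decode("utf-8", errors="replace"))`, should show zero price spans and at least one location prompt if the site is indeed waiting for a ZIP code.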

Link to an introduction to Selenium: https://medium.freecodecamp.org/better-web-scraping-in-python-with-selenium-beautiful-soup-and-pandas-d6390592e251
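As a rough idea of what that could look like, here is a minimal sketch, assuming Selenium 4 with Chrome, plus `bs4` and `lxml` installed. The CSS selectors (`js-list-zip-entry-input`, `js-list-zip-entry-submit`, `js-price`, `product-wrapper-right`) come from the HTML and code in the question, so they reflect the page as it was when the question was asked and may no longer match. The function names `parse_price` and `scrape_prices`, the timeout, and the flow of typing a ZIP code into the form are assumptions, not something the site documents.

```python
import re
from decimal import Decimal


def parse_price(text):
    """Turn a string such as '$1,499.00' into a Decimal, or None if no price is found."""
    match = re.search(r"\$\s*([\d,]+(?:\.\d{2})?)", text)
    return Decimal(match.group(1).replace(",", "")) if match else None


def scrape_prices(url, zip_code, timeout=15):
    """Open url in Chrome, submit zip_code in the location form, return the listed prices."""
    # Third-party imports live here so parse_price stays usable without them.
    from bs4 import BeautifulSoup
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        wait = WebDriverWait(driver, timeout)
        # Assumption: the ZIP form from the scraped HTML is present and clickable.
        zip_box = wait.until(
            EC.element_to_be_clickable((By.CSS_SELECTOR, "input.js-list-zip-entry-input"))
        )
        zip_box.send_keys(zip_code)
        driver.find_element(By.CSS_SELECTOR, "button.js-list-zip-entry-submit").click()
        # Wait until at least one price span has been rendered.
        wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, "span.js-price")))
        soup = BeautifulSoup(driver.page_source, "lxml")
    finally:
        driver.quit()

    prices = []
    for container in soup.find_all("div", class_="product-wrapper-right"):
        span = container.find("span", class_="js-price")
        if span is not None:
            prices.append(parse_price(span.get_text()))
    return prices
```

Calling `scrape_prices(my_url, "10001")` (the ZIP code is just a placeholder) would then return one `Decimal` per product that shows a price. Once the browser has rendered the page, the parsing step is the same BeautifulSoup lookup as in the question.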

