Python - Web Scraper - Price Not Incrementing

Problem description

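I wrote a (very basic) web scraper that scrapes products from the Sam's Club website and prints each product's name and price.

The problem is that Python prints the same price (the price of the first product on the page) for every other item, even though the name changes correctly.

If I change the page being scraped, the price changes to the first price on that page, and that price is then printed for every other item as well.

I don't understand why everything else works but the product price does not.

(Side note: the price variables look inelegant and confusing because Sam's Club splits its prices into 3 fields on the server side: price = $, price2 = dollars, price3 = cents.)

Thanks for your help. The code is below: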

import requests, bs4
from bs4 import BeautifulSoup

#makes each request look like a human request
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3202.94 Safari/537.36',
        'X-Requested-With': 'XMLHttpRequest',
        'Accept': 'application/json, text/javascript, */*; q=0.01',
        'Cookie': 'localeEditionShown_en=true; permutive-session=^%^7B^%^22session_id^%^22^%^3A^%^22e5386dfb-c58a-4882-b0e1-2cc2d9518982^%^22^%^2C^%^22last_updated^%^22^%^3A^%^222017-11-22T19^%^3A10^%^3A04.522Z^%^22^%^7D; visid_incap_774904=4xMirl1lRNOgrnN+Sm9S1zNx61kAAAAAREIPAAAAAACAsmaAAbBYMBjQTCqLf/D6wOVO4hdnKjIF; incap_ses_151_774904=/LX+SNRqsR8SzJi7p3YYAjKgGloAAAAApdQygw8VYBxbv/wvl7Be7A==; _gat=1; _gat_subdomainTracker=1; _ga=GA1.2.1522498341.1508602188; _gid=GA1.2.1243543827.1511694421'
        }

#defines url and requests/beautifulsoup variables
url = "https://www.samsclub.com/s/gatorade"
page = requests.get(url, headers=headers)
soup = BeautifulSoup(page.content, 'lxml')
productlist = soup.find_all('div',class_='sc-pc-title-medium')


for products in productlist:
    name = soup.find('div',class_='sc-pc-title-medium').text.strip()
    price = soup.find('span', class_='Price-currency').text.strip()
    price2 = soup.find('span', class_='Price-characteristic').text.strip()
    price3 = soup.find('span', class_='Price-mantissa').text.strip()
    productprice = price + price2 + '.' + price3 #need to find out why its not updating
    
    results = {
        'product name': products.text.strip(),
        'product price': productprice
    }

    
    print(results) 

Tags: python-3.x, web-scraping

Solution


I would approach this differently.

Get all the product names and all the span tags with the class Price-group, then zip them together.

Here's how:

import itertools

import requests
from bs4 import BeautifulSoup

page = requests.get("https://www.samsclub.com/s/gatorade")
soup = BeautifulSoup(page.content, 'lxml')

# Collect every product name on the page, in document order.
product_names = [
    p.getText(strip=True) for p
    in soup.find_all("div", class_="sc-pc-title-medium")
]
# Use the last whitespace-separated token of each Price-group span's
# title attribute as the price.
product_prices = [
    p["title"].split()[-1] for p in soup.find_all("span", class_="Price-group")
]

# Pair names and prices by position; the shorter list is padded with "N/A".
results = {
    k: v for k, v
    in itertools.zip_longest(product_names, product_prices, fillvalue="N/A")
}

for product, price in results.items():
    print(product, price)

Output:

Gatorade Frost Variety Pack (20oz / 24pk) $14.98
Gatorade Sports Drinks Core Variety Pack (12oz / 28pk) $12.78
Gatorade Sports Drinks Liberty Variety Pack (20 oz., 24 pk.) $14.98
Gatorade Zero Variety Pack (12oz / 28pk) $12.78
Gatorade Berry Variety Pack (12 oz., 28 pk.) $12.78
Gatorade Thirst Quencher Powder, Frost Glacier Freeze (76 oz.) $9.98
Gatorade Zero Thirst Quencher Variety Pack (20 oz., 24 pk.) $14.98
Gatorade Powder Lemon-Lime (76.5oz) $9.98
Gatorade Squeeze Bottles (32oz / 3pk) $14.99
Gatorade Medium Classic Cooler (3Gal) $29.98
Selectivend CB500 Gatorade 10 Selection Drink Machine $4,638.00
Gatorade Liberty Variety Pack (12oz / 28pk) $12.78
Gatorade Fierce Variety Pack (20oz / 24pk) $12.98
Gatorade Frost Variety Pack (12oz / 28pk) $12.78
BOLT24 Fueled by Gatorade Variety Pack (16.9oz /15pk) $12.98
Gatorade Fruit Punch (20oz / 24pk) $14.98
Gatorade Lemon-Lime (20 oz., 24 pk.) $14.98
Gatorade Cool Blue (20 oz., 24 pk.) $14.98
Gatorade Orange (20 oz., 24 pk.) $14.98
Gatorade Frost Arctic Blitz (12 fl. oz, 28 pk.) $10.48
Propel Immune Support Zero Sugar Variety Pack (16.9 fl. oz., 24 pk.) $12.98
Gatorade G2 Variety Pack (20oz / 24pk) $12.98
Gatorade Variety Pack (24 oz., 12 pk.) $12.25
Gatorade Cool Blue (32 oz., 12pk.) $12.98
Gatorade Classic Variety Pack (32 oz., 12 pk.) $12.25
Gatorade Orange (32 oz., 12 pk.) $12.25
Gatorade Lemon-Lime (32 oz., 12 pk.) $12.13
Gatorade Fruit Punch (32 oz., 12 pk.) $10.98
Gatorade Fierce Fruit Punch + Berry Thirst Quencher (12oz / 28pk) $10.98
Gatorade Fierce Blue Cherry Thirst Quencher (12oz / 28pk) $10.98
Gatorade Orange Thirst Quencher (12oz / 28pk) $10.98
Gatorade Zero Glacier Freeze Thirst Quencher (12oz / 28pk) $12.25
Gatorade Power Variety Pack (20.83oz / 24pk) $5,748.00
Gatorade Frost Glacier Freeze (32oz / 12pk) N/A
Gatorade All Star Club Variety Pack (24pk/11.83oz) N/A
Gatorade Fruit Punch (20.83oz / 24pk) N/A
Selectivend WS3000/CB300 Gatorade Combo Vending Machine N/A
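
As for why the code in the question repeats the first price: inside the loop, every soup.find(...) call searches the whole document again, so it returns the first matching tag on the page on each iteration. Only products.text changes, because it comes from the loop variable. Pairing names and prices by position avoids that, but note that the N/A entries above all land at the end, which is what positional pairing produces whenever the two lists differ in length; if a product in the middle of the page has no price, later prices may end up next to the wrong names.

A possible alternative is to search inside each product's own container, so a name is only ever matched with a price from the same block. The following is a minimal sketch on made-up markup, not a tested scraper for the live site: the wrapper class "product-card", the helper names text_of and scrape_cards, and the sample HTML are all assumptions for illustration. Only the inner class names are taken from the question.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the wrapper class "product-card" is an assumption;
# the inner class names are the ones used in the question.
SAMPLE_HTML = """
<div class="product-card">
  <div class="sc-pc-title-medium">Product A</div>
  <span class="Price-currency">$</span>
  <span class="Price-characteristic">14</span>
  <span class="Price-mantissa">98</span>
</div>
<div class="product-card">
  <div class="sc-pc-title-medium">Product B</div>
</div>
<div class="product-card">
  <div class="sc-pc-title-medium">Product C</div>
  <span class="Price-currency">$</span>
  <span class="Price-characteristic">9</span>
  <span class="Price-mantissa">98</span>
</div>
"""


def text_of(parent, tag, css_class):
    """Return the stripped text of the first match inside parent, or None."""
    node = parent.find(tag, class_=css_class)
    return node.get_text(strip=True) if node is not None else None


def scrape_cards(html):
    """Return (name, price) pairs, looking up each price inside its own card."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for card in soup.find_all("div", class_="product-card"):
        # Search from the card, not from soup, so each lookup is scoped
        # to the current product.
        name = text_of(card, "div", "sc-pc-title-medium")
        currency = text_of(card, "span", "Price-currency")
        dollars = text_of(card, "span", "Price-characteristic")
        cents = text_of(card, "span", "Price-mantissa")
        if None in (currency, dollars, cents):
            price = "N/A"  # this card has no complete price
        else:
            price = f"{currency}{dollars}.{cents}"
        rows.append((name, price))
    return rows


for name, price in scrape_cards(SAMPLE_HTML):
    print(name, price)
```

On the sample markup, the product without a price gets N/A in its own row and the products after it keep their own prices. To use this on a real page, the wrapper selector would have to be replaced with whatever element actually encloses one product there.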
