BeautifulSoup + for loop with zip: how to try/except one of the values

Problem description

I have the following code:

from bs4 import BeautifulSoup
import requests
import re

# Source Sites
mimo = 'https://tienda.mimo.com.ar/mimo/junior/ropa-para-ninas.html'
cheeky = ''
grisino = ''


source = requests.get(mimo).text

soup = BeautifulSoup(source, 'lxml')

for name_product, old_price, special_price in zip(soup.select('h3.titprod'),
                                                  soup.select('span[id^="old-price"]'),
                                                  soup.select('span[id^="product-price"]')):
    print(f'Name: {name_product.text.strip()} |  Old price = {old_price.text.strip()} | Discounted price = {special_price.text.strip()}')

It prints the information perfectly:

Name: TAPABOCAS |  Old price = $ 295 | Discounted price = $ 236
Name: REMERA JR TOWN |  Old price = $ 990 | Discounted price = $ 743
Name: CAMISOLA NENA DELFI |  Old price = $ 2.300 | Discounted price = $ 1.725
Name: CAMISOLA JR TRAFUL |  Old price = $ 1.550 | Discounted price = $ 1.163
Name: VESTIDO NENA DELFI |  Old price = $ 2.990 | Discounted price = $ 2.243
Name: SAQUITO JR DESAGUJADO |  Old price = $ 1.990 | Discounted price = $ 1.493
Name: JEGGING JR ENGOMADO |  Old price = $ 1.990 | Discounted price = $ 1.493

But sometimes the special_price selector does not find a discounted price, so I need a try/except. I tried to "preprocess" the prices first, but I can't get it to work:

special_prices_with_defaults_added = []
for sp in soup.select('span[id^="product-price"]'):
    try:
        special_prices_with_defaults_added.append(sp.text.strip())
    except:
        special_prices_with_defaults_added.append("No default price available")

for name_product, old_price, special_price in zip(
    soup.select('h3.titprod'), soup.select('span[id^="old-price"]'), special_prices_with_defaults_added):
    print(f'Name: {name_product.text.strip()} |  Old price = {old_price.text.strip()} | Discounted price = {special_prices_with_defaults_added}')

Wrong output:

Name: TAPABOCAS |  Old price = $ 295 | Discounted price = ['$\xa0236', '$\xa0743', '$\xa01.725', '$\xa01.163', '$\xa02.243', '$\xa01.493', '$\xa01.493', '$\xa02.925', '$\xa0668', '$\xa0713', '$\xa01.688', '$\xa01.268', '$\xa0593', '$\xa0743', '$\xa01.125', '$\xa03.300', '$\xa02.175', '$\xa0743', '$\xa01.493', '$\xa0863', '$\xa0668', '$\xa0792', '$\xa01.520', '$\xa01.760', '$\xa0696', '$\xa03.150', '$\xa03.520', '$\xa0712', '$\xa01.352', '$\xa01.112', '$\xa01.112', '$\xa01.192', '$\xa02.800', '$\xa02.720', '$\xa03.920', '$\xa01.920']
Name: REMERA JR TOWN |  Old price = $ 990 | Discounted price = ['$\xa0236', '$\xa0743', '$\xa01.725', '$\xa01.163', '$\xa02.243', '$\xa01.493', '$\xa01.493', '$\xa02.925', '$\xa0668', '$\xa0713', '$\xa01.688', '$\xa01.268', '$\xa0593', '$\xa0743', '$\xa01.125', '$\xa03.300', '$\xa02.175', '$\xa0743', '$\xa01.493', '$\xa0863', '$\xa0668', '$\xa0792', '$\xa01.520', '$\xa01.760', '$\xa0696', '$\xa03.150', '$\xa03.520', '$\xa0712', '$\xa01.352', '$\xa01.112', '$\xa01.112', '$\xa01.192', '$\xa02.800', '$\xa02.720', '$\xa03.920', '$\xa01.920']

Tags: python, html, parsing, web-scraping, beautifulsoup

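Solution

As @furas pointed out, this only needs a small fix to the for loop: the f-string has to print the loop variable special_price, not the whole special_prices_with_defaults_added list.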

for name_product, old_price, special_price in zip(
        soup.select('h3.titprod'), soup.select('span[id^="old-price"]'), special_prices_with_defaults_added):
    print(
        f'Name: {name_product.text.strip()} |  Old price = {old_price.text.strip()} | Discounted price = {special_price}')
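Note that the fix above only corrects what gets printed. soup.select() returns only the elements that exist, so the try/except in the preprocessing loop never fires, and if one product has no discounted price, zip() will silently pair the remaining prices with the wrong products. A possible way around this is to loop over each product's container and look up the fields inside it. The sketch below is only an illustration: the inline HTML, the li.item container selector, and the text_or_default and extract_products helpers are assumptions, not taken from the real page, so the container selector would need to be adapted after inspecting the actual markup.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page; the actual container
# element and its class name may differ.
html = """
<ul>
  <li class="item">
    <h3 class="titprod">TAPABOCAS</h3>
    <span id="old-price-1">$ 295</span>
    <span id="product-price-1">$ 236</span>
  </li>
  <li class="item">
    <h3 class="titprod">REMERA JR TOWN</h3>
    <span id="old-price-2">$ 990</span>
  </li>
</ul>
"""


def text_or_default(parent, selector, default="N/A"):
    """Return the stripped text of the first match under parent, or default."""
    tag = parent.select_one(selector)  # select_one returns None when nothing matches
    return tag.get_text(strip=True) if tag is not None else default


def extract_products(markup):
    """Collect name and prices per product container, so fields stay aligned."""
    soup = BeautifulSoup(markup, "html.parser")
    products = []
    for item in soup.select("li.item"):  # assumed container selector
        products.append({
            "name": text_or_default(item, "h3.titprod"),
            "old_price": text_or_default(item, 'span[id^="old-price"]'),
            "special_price": text_or_default(
                item, 'span[id^="product-price"]', "No discounted price available"
            ),
        })
    return products


products = extract_products(html)
for p in products:
    print(f'Name: {p["name"]} |  Old price = {p["old_price"]} | '
          f'Discounted price = {p["special_price"]}')
```

Because every lookup is scoped to a single container, a missing span only affects that product's entry and no exception handling is needed.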
