python - How to print "Not found" in soup.find_all?
Question
Some rows contain the following HTML:
<span class="price"><span class="woocommerce-Price-amount amount"><span class="woocommerce-Price-currencySymbol">SAR</span> 625.00</span> <small class="woocommerce-price-suffix">(Excluding Tax)</small></span>
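In other rows this markup is missing. When it is missing for a row, I want "Not found" to be recorded for that row instead.
I am using the following code, but it does not give the correct result: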
p = soup.findAll("span", {"class": "price"})
for price in p:
    if price in p:
        prices.append(price.text)
    else:
        prices.append("Not found")
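Can someone help me solve this?
Answer
Search for each product's parent element first, in this case <div class="product-inner">, and then search for the price inside it. If no price is found, set it to "Not found".
For example: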
import requests
from bs4 import BeautifulSoup

url = 'https://www.softland.com.sa/category/brand'
soup = BeautifulSoup(requests.get(url).content, 'html.parser')

# Iterate over the product containers, not over the prices themselves
for product in soup.find_all('div', class_='product-inner'):
    title = product.h2.text
    # find() returns None when this product has no price element
    price = product.find('span', class_='amount')
    if price:
        price = price.text
    else:
        price = 'Not found'
    print('{:<15} {}'.format(price, title))
Prints:
SAR 350.00      SAMSUNG SSD 860 EVO SATA III 2.5 inch 500 GB-MZ-76E500BW-887276231631
SAR 625.00      Samsung 860 EVO 1TB 2.5 Inch SATA III Internal SSD MZ-76E1T0BW-887276231648
Not found       MSI Z390-A PRO 9th Gen. Motherboard
SAR 75.00       ASUS Internal 24X DVD BURNER Black (DRW-24D5MT)
SAR 2,750.00    Gaming Series #1
SAR 485.00      Samsung 970 Evo Plus M.2-500GB-MZ-V7S500BW-887276302546
SAR 395.00      CoolerMaster MasterBox MB520 Red-Trim
SAR 150.00      Xtrike Wired 4 in 1 Combo with ENG/ARABIC layout -CM-400
SAR 350.00      NZXT Aer RGB 120 Triple Pack RGB LED PWM Fan for HUE (RF-AR120-T1)
SAR 895.00      Samsung M.2-970 Evo Plus 1TB-MZ-V7S1T0BW-887276302522
SAR 395.00      TEAM DELTA R BLACK UD-D4 8GBx2 3000
SAR 350.00      Samsung M.2-970 Evo Plus 250GB-MZ-V7S250BW-887276302515
SAR 190.00      TEAM Vulcan Z RED UD-D4 8GB 2666
SAR 125.00      NZXT Aer RGB120 Single Pack 120mm Digitally Controlled RGB LED Fans for HUE (RF-AR120-B1)
SAR 190.00      Gamdias Hermes E2 7 Neon Color Mechanical Gaming Keyboard with Blue Switches (87 Keys)
SAR 560.00      Windows 10 Pro 64-bit Arabic – OEM
SAR 935.00      Corsair Crystal Series 570X RGB ATX Mid-Tower Case RED-CC-9011111-WW
SAR 325.00      BitFenix Prodigy M Window Side Panel Computer Case, Black, BFC-PRM-300-RRWKK-RP, Micro ATX / Mini-ITX
SAR 180.00      DEEPCOOL GAMMAXX 200T CPU AirCooler
SAR 480.00      Corsair Vengeance RGB Pro 16GB (2x8GB) DDR4-3200 CMW16GX4M2E3200C16
Edit: changed the class from product to product-inner.
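The loop in the question never reaches its else branch because find_all only returns elements that exist, so a product without a price simply leaves no entry in the list. The sketch below illustrates the difference offline. The HTML string and the product names in it are made up for the illustration and are not taken from the real page.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: three products, the second one has no price.
HTML = """
<div class="product-inner"><h2>Product A</h2>
  <span class="price"><span class="amount">SAR 350.00</span></span></div>
<div class="product-inner"><h2>Product B</h2></div>
<div class="product-inner"><h2>Product C</h2>
  <span class="price"><span class="amount">SAR 75.00</span></span></div>
"""

soup = BeautifulSoup(HTML, "html.parser")

# A flat search returns only the price tags that exist, so the gap is invisible.
flat_prices = [tag.get_text(strip=True)
               for tag in soup.find_all("span", class_="amount")]

# Searching inside each product yields exactly one result per product.
rows = []
for product in soup.find_all("div", class_="product-inner"):
    price = product.find("span", class_="amount")  # None when the price is missing
    price_text = price.get_text(strip=True) if price is not None else "Not found"
    rows.append((product.h2.get_text(strip=True), price_text))

print(flat_prices)
print(rows)
```

The flat list has two entries for three products, with nothing to indicate which product lost its price, while the per-product loop keeps titles and prices aligned.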
To get the information as a pandas DataFrame, you can use the following script:
import requests
import pandas as pd
from bs4 import BeautifulSoup

url = 'https://www.softland.com.sa/category/brand'
soup = BeautifulSoup(requests.get(url).content, 'html.parser')

all_data = []
for product in soup.find_all('div', class_='product-inner'):
    title = product.h2.text
    # find() returns None when this product has no price element
    price = product.find('span', class_='amount')
    if price:
        price = price.text
    else:
        price = 'Not found'
    all_data.append({'Title': title, 'Price': price})

# One row per product, including those without a price
df = pd.DataFrame(all_data)
print(df)
Prints:
                                                Title         Price
0   SAMSUNG SSD 860 EVO SATA III 2.5 inch 500 GB-M...    SAR 350.00
1   Samsung 860 EVO 1TB 2.5 Inch SATA III Internal...    SAR 625.00
2                 MSI Z390-A PRO 9th Gen. Motherboard     Not found
3     ASUS Internal 24X DVD BURNER Black (DRW-24D5MT)     SAR 75.00
4                                    Gaming Series #1  SAR 2,750.00
5   Samsung 970 Evo Plus M.2-500GB-MZ-V7S500BW-887...    SAR 485.00
6               CoolerMaster MasterBox MB520 Red-Trim    SAR 395.00
7   Xtrike Wired 4 in 1 Combo with ENG/ARABIC layo...    SAR 150.00
8     NZXT Aer RGB 120 Triple Pack RGB LED PWM Fan...    SAR 350.00
9   Samsung M.2-970 Evo Plus 1TB-MZ-V7S1T0BW-88727...    SAR 895.00
10                TEAM DELTA R BLACK UD-D4 8GBx2 3000    SAR 395.00
11  Samsung M.2-970 Evo Plus 250GB-MZ-V7S250BW-887...    SAR 350.00
12                   TEAM Vulcan Z RED UD-D4 8GB 2666    SAR 190.00
13  NZXT Aer RGB120 Single Pack 120mm Digitally Co...    SAR 125.00
14  Gamdias Hermes E2 7 Neon Color Mechanical Gami...    SAR 190.00
15                 Windows 10 Pro 64-bit Arabic – OEM    SAR 560.00
16  Corsair Crystal Series 570X RGB ATX Mid-Tower ...    SAR 935.00
17  BitFenix Prodigy M Window Side Panel Computer ...    SAR 325.00
18                DEEPCOOL GAMMAXX 200T CPU AirCooler    SAR 180.00
19  Corsair Vengeance RGB Pro 16GB (2x8GB) DDR4-32...    SAR 480.00
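Note that product.h2.text in the scripts above raises AttributeError if a product has no <h2> at all. One possible way to guard both lookups is a small helper that returns a default for any missing child. The following is only a sketch: find_text, the "Untitled" default, and the sample HTML are hypothetical names and data introduced for illustration, not part of the original answer.

```python
from bs4 import BeautifulSoup


def find_text(parent, name, class_=None, default="Not found"):
    """Return the stripped text of the first matching child, or `default`."""
    # Only pass class_ when it is given, so an omitted class is not
    # treated as a filter of its own.
    kwargs = {"class_": class_} if class_ else {}
    tag = parent.find(name, **kwargs)
    return tag.get_text(strip=True) if tag is not None else default


# Hypothetical markup: one product lacks a title, another lacks a price.
HTML = """
<div class="product-inner"><h2> Product A </h2><span class="amount">SAR 350.00</span></div>
<div class="product-inner"><span class="amount">SAR 75.00</span></div>
<div class="product-inner"><h2>Product C</h2></div>
"""

soup = BeautifulSoup(HTML, "html.parser")
all_data = [
    {
        "Title": find_text(product, "h2", default="Untitled"),
        "Price": find_text(product, "span", "amount"),
    }
    for product in soup.find_all("div", class_="product-inner")
]

for row in all_data:
    print(row)
```

The resulting list of dictionaries has the same shape as all_data in the script above, so it could be passed to pd.DataFrame in the same way.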