beautifulsoup - Scraping Flipkart with BeautifulSoup raises an error
Problem description
I am trying to write a script that scrapes data from Flipkart. The code is as follows:
from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup
my_url = 'https://www.flipkart.com/search?q=iphone&sort=recency_desc'
uClient = uReq(my_url)
page_html = uClient.read()
uClient.close()
page_soup = soup(page_html, "html.parser")
containers = page_soup.findAll("div", {"class": "_3liAhj"})
container = containers[0]
for container in containers:
    title_container = container.findAll("a", {"class": "_2cLu-l"})
    title = title_container[0].text
    price_container = container.findAll("div", {"class": "_1vC4OE"})
    price = price_container[0].text
    rating_container = container.findAll("span", {"class": "_2_KrJI"})
    rating = rating_container[0].text
    print("title : " + title)
    print("price : " + price)
    print("rating : " + rating)
The output looks like this:
title : Apple iPhone SE (White, 128 GB)
price : ₹47,800
rating : 4.6
The output ends with this error:
Traceback (most recent call last):
  File "test.py", line 22, in <module>
    rating = rating_container[0].text
IndexError: list index out of range
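I think this happens because some products have no rating. What is the error, and how can I avoid it? Thanks for your help.
Solution
Some items have no rating, so you need to handle that case. For those items findAll returns an empty list, and indexing it with [0] raises the IndexError.
For example (a missing rating is replaced with -):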
import requests
from bs4 import BeautifulSoup
url = 'https://www.flipkart.com/search?q=iphone&sort=recency_desc'
soup = BeautifulSoup(requests.get(url).content, 'html.parser')
for product in soup.select('[data-id]'):
    title = product.select_one('a + a')['title']
    product_rating = product.select_one('span[id^="productRating_"]')
    product_rating = product_rating.get_text(strip=True) if product_rating else '-'
    price = product.find(lambda t: t.name=='div' and t.text.startswith('₹')).div.get_text(strip=True)
    print('{:<10} {:<5} {}'.format(price, product_rating, title))
Prints:
₹47,800 4.6 Apple iPhone SE (White, 128 GB)
₹47,800 4.6 Apple iPhone SE (Red, 128 GB)
₹47,800 4.6 Apple iPhone SE (Black, 128 GB)
₹42,500 4.6 Apple iPhone SE (White, 64 GB)
₹58,300 4.6 Apple iPhone SE (Black, 256 GB)
₹42,500 4.6 Apple iPhone SE (Red, 64 GB)
₹58,300 4.6 Apple iPhone SE (Red, 256 GB)
₹42,500 4.6 Apple iPhone SE (Black, 64 GB)
₹58,300 4.6 Apple iPhone SE (White, 256 GB)
₹1,21,300 4.7 Apple iPhone 11 Pro (Space Grey, 256 GB)
₹1,17,100 4.7 Apple iPhone 11 Pro Max (Silver, 64 GB)
₹73,600 4.7 Apple iPhone 11 (Black, 128 GB)
₹999 - CallSmith Screen Guard & Protector Applicator Accessory Combo for iPhone XR/ 11
₹73,600 4.7 Apple iPhone 11 (Yellow, 128 GB)
₹68,300 4.7 Apple iPhone 11 (Yellow, 64 GB)
₹1,17,100 4.7 Apple iPhone 11 Pro Max (Gold, 64 GB)
₹73,600 4.7 Apple iPhone 11 (Green, 128 GB)
₹73,600 4.7 Apple iPhone 11 (Purple, 128 GB)
₹68,300 4.7 Apple iPhone 11 (Green, 64 GB)
₹1,17,100 4.7 Apple iPhone 11 Pro Max (Midnight Green, 64 GB)
₹73,600 4.7 Apple iPhone 11 (Red, 128 GB)
₹73,600 4.7 Apple iPhone 11 (White, 128 GB)
₹68,300 4.7 Apple iPhone 11 (White, 64 GB)
₹68,300 4.7 Apple iPhone 11 (Black, 64 GB)
₹68,300 4.7 Apple iPhone 11 (Red, 64 GB)
₹158 - HOBBYTRONICS Tempered Glass Guard for Apple iPhone 11, Apple iPhone XR
₹1,06,600 4.7 Apple iPhone 11 Pro (Silver, 64 GB)
₹1,06,600 4.7 Apple iPhone 11 Pro (Midnight Green, 64 GB)
₹68,300 4.7 Apple iPhone 11 (Purple, 64 GB)
₹1,06,600 4.7 Apple iPhone 11 Pro (Gold, 64 GB)
₹52,500 4.6 Apple iPhone XR (Blue, 64 GB)
₹52,500 4.6 Apple iPhone XR (White, 64 GB)
₹52,500 4.6 Apple iPhone XR (Yellow, 64 GB)
₹52,500 4.6 Apple iPhone XR ((PRODUCT)RED, 64 GB)
₹57,800 4.6 Apple iPhone XR (Coral, 128 GB)
₹57,800 4.6 Apple iPhone XR (White, 128 GB)
₹52,500 4.6 Apple iPhone XR (Coral, 64 GB)
₹52,500 4.6 Apple iPhone XR (Black, 64 GB)
₹135 - HOBBYTRONICS Tempered Glass Guard for Apple iPhone 11, Apple iPhone XR
₹62,999 4.7 Apple iPhone XS (Gold, 64 GB)
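If several fields can be missing, the same guard can be pulled into a small helper. The following is only a minimal sketch: the markup, the class names (card, title, rating) and the helper names text_or_default and parse_cards are all made up for illustration and are not Flipkart's actual structure. It relies on select_one returning None when nothing matches, so the lookup never raises.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for a search results page; the class
# names are invented for illustration and are not Flipkart's.
SAMPLE_HTML = """
<div class="card"><a class="title">Phone A</a><span class="rating">4.6</span></div>
<div class="card"><a class="title">Screen Guard B</a></div>
"""


def text_or_default(parent, selector, default="-"):
    """Return the stripped text of the first match, or `default` if none."""
    node = parent.select_one(selector)  # None when nothing matches
    return node.get_text(strip=True) if node is not None else default


def parse_cards(html):
    """Collect (title, rating) pairs, tolerating cards without a rating."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for card in soup.select("div.card"):
        title = text_or_default(card, "a.title")
        rating = text_or_default(card, "span.rating")
        rows.append((title, rating))
    return rows


if __name__ == "__main__":
    for title, rating in parse_cards(SAMPLE_HTML):
        print("{:<5} {}".format(rating, title))
```

The second card has no rating element, so its rating falls back to - instead of stopping the loop with an exception.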