python - Web scraping returns None
Problem description
I am trying to use requests and bs4 to get the prices of a list of monitors from Amazon.
Here is the code:
from bs4 import BeautifulSoup
import re
import requests
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.36',}
res = requests.get("https://www.amazon.com/s?i=specialty-aps&bbn=16225007011&rh=n%3A16225007011%2Cn%3A1292115011&ref=nav_em__nav_desktop_sa_intl_monitors_0_2_6_8", headers=headers)
print(res)
soup = BeautifulSoup(res.text, "html.parser")
price=soup.find_all(class_="a-price-whole")
print(price.text)
I don't understand why it returns None. I am basically following a video, https://www.youtube.com/watch?v=Bg9r_yLk7VY&t=467s&ab_channel=DevEd, and on their side it returns the text. Can someone point out what I am doing wrong?
Solution
You have probably been served a CAPTCHA page instead of the search results. Try adding the "Accept-Language" HTTP header:
import re
import requests
from bs4 import BeautifulSoup
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.36",
    "Accept-Language": "en-US,en;q=0.5",
}
res = requests.get(
    "https://www.amazon.com/s?i=specialty-aps&bbn=16225007011&rh=n%3A16225007011%2Cn%3A1292115011&ref=nav_em__nav_desktop_sa_intl_monitors_0_2_6_8",
    headers=headers,
)
soup = BeautifulSoup(res.text, "html.parser")
prices = soup.find_all(class_="a-price-whole")
for price in prices:
    print(
        price.find_previous("h2").text[:30] + "...",
        price.text + price.find_next(class_="a-price-fraction").text,
    )
Prints:
Sceptre IPS 27-Inch Business C... 159.17
EVICIV 12.3’’ Raspberry Pi Tou... 199.99
Portable Monitor, 17.3'' IPS H... 349.99
Acer R240HY bidx 23.8-Inch IPS... 129.99
Dell SE2419Hx 24" IPS Full HD ... 169.95
HP Pavilion 22cwa 21.5-Inch Fu... 139.99
Sceptre E248W-19203R 24" Ultra... 127.98
LG 27GL83A-B 27 Inch Ultragear... 379.99
LG 24M47VQ 24-Inch LED-lit Mon... 99.99
LG 27UN850-W 27 Inch Ultrafine... 404.14
Sceptre IPS 24-Inch Business C... 142.17
Planar PXN2400 Full HD Thin Pr... 139.00
Sceptre IPS 24-Inch Business C... 142.17
Portable Triple Screen Laptop ... 419.99
ASUS ZenScreen 15.6" 1080P Por... 232.52
HP M27ha FHD Monitor - Full HD... 199.99
ASUS 24" 1080P Gaming Monitor ... 189.99
Dell P2419H 24 Inch LED-Backli... 187.99
LG 32QN600-B 32-Inch QHD (2560... 249.99
LG 29WN600-W 29" 21:9 UltraWid... 226.99
Acer Nitro XV272U Pbmiiprzx 27... 299.99
AOC C24G1 24" Curved Frameless... 186.99
Samsung CF390 Series 27 inch F... 199.00
ASUS VY279HE 27” Eye Care Moni... 219.00
SAMSUNG LC24F396FHNXZA 23.5" F... 149.99
Sceptre E275W-19203R 27" Ultra... 169.97
ASUS VG245H 24 inchFull HD 108... 164.95
PEPPER JOBS 15.6" USB-C Portab... 199.99
13.3 inch Portable Monitor,KEN... 96.99
Eyoyo Small Monitor 8 inch Min... 76.98
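To tell a CAPTCHA page apart from a real results page before parsing, a check along the following lines may help. This is only a minimal sketch: the helper names `looks_blocked` and `extract_prices`, the marker strings in `BLOCK_MARKERS`, and the `SAMPLE` HTML are assumptions made for illustration, not anything Amazon documents. It also iterates over the result of `find_all`, which is a list-like ResultSet with no `.text` of its own, so `.text` has to be read from each element.

```python
from bs4 import BeautifulSoup

# Assumed marker strings for a block/CAPTCHA page; not a documented contract.
BLOCK_MARKERS = ("enter the characters you see below", "captcha")


def looks_blocked(html):
    """Return True if the HTML looks like a CAPTCHA page rather than results."""
    lowered = html.lower()
    return any(marker in lowered for marker in BLOCK_MARKERS)


def extract_prices(html):
    """Return (title, price) pairs; skip prices with no preceding <h2>."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for whole in soup.find_all(class_="a-price-whole"):
        title = whole.find_previous("h2")
        if title is None:
            continue
        fraction = whole.find_next(class_="a-price-fraction")
        cents = fraction.get_text(strip=True) if fraction else ""
        rows.append(
            (title.get_text(strip=True), whole.get_text(strip=True) + cents)
        )
    return rows


# Hypothetical offline sample so the sketch runs without contacting any site.
SAMPLE = """
<div><h2>Example Monitor</h2>
<span class="a-price-whole">159.</span><span class="a-price-fraction">17</span></div>
"""

if looks_blocked(SAMPLE):
    print("Got a CAPTCHA page; no prices to parse.")
else:
    print(extract_prices(SAMPLE))
```

With a live request, `res.text` would be passed in place of `SAMPLE`. Printing `res.status_code` and the first few hundred characters of `res.text` is another quick way to see which page actually came back.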