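python - Can't use image IDs to build full image links
Question
I'm trying to scrape all the image links from this webpage using the requests module. When I use this link, I can only scrape the image links up to the point where the rest of the content is loaded by scrolling down. However, if I use this link, I can get all the image IDs by incrementing the last number appended to the link. The problem is that I can't reuse these IDs to build complete image links.
I've tried: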
import requests
from bs4 import BeautifulSoup

url = 'https://stocksnap.io/api/search-photos/phone/relevance/desc/1'

with requests.Session() as s:
    s.headers['User-Agent'] = 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.83 Safari/537.36'
    r = s.get(url)
    for item in r.json()['results']:
        print(item['img_id'])
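How can I get all the image links from that site's landing page?
PS: The first few sponsored image links should be ignored, since they aren't included in the API either.
Solution
Inspecting the page shows that each image URL is built from the image ID and the first two keywords returned by the API: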
import requests

url = 'https://stocksnap.io/api/search-photos/phone/relevance/desc/{}'

page = 1
while True:
    data = requests.get(url.format(page)).json()
    # An empty 'results' list means there are no more pages.
    if not data['results']:
        break
    for r in data['results']:
        # Photo page URL: <keyword 1>-<keyword 2>-<image ID>
        print('https://stocksnap.io/photo/{}-{}-{}'.format(r['keywords'][0], r['keywords'][1], r['img_id']))
    page += 1
Prints:
...
https://stocksnap.io/photo/iphone-cellphone-LNXYMM77SS
https://stocksnap.io/photo/business-technology-OGLUHZAPGF
https://stocksnap.io/photo/samsung-android-7ZALGLUAAW
https://stocksnap.io/photo/apple-macbook-55A6840521
https://stocksnap.io/photo/woman-talking-54C3E9FE9D
https://stocksnap.io/photo/samsung-galaxy-BB3307280A
https://stocksnap.io/photo/parc-bench-3D99A31C0C
https://stocksnap.io/photo/iphone-cellphone-E2C541A7DC
https://stocksnap.io/photo/iphone-mockup-167A645BDC
https://stocksnap.io/photo/mac-keyboard-BA9AFFE0BF
https://stocksnap.io/photo/sony-android-EB939B3311
https://stocksnap.io/photo/iphone-cellphone-B962ABCAC7
https://stocksnap.io/photo/building-man-D49A8BB4AE
https://stocksnap.io/photo/technology-computer-C9B37875B9
https://stocksnap.io/photo/iphone-cellphone-381F0FD1EE
https://stocksnap.io/photo/work-bag-96E1A8F1CB
https://stocksnap.io/photo/iphone-phone-70FE8C00C9
https://stocksnap.io/photo/iphone-mockup-9FCDF4E1F5
https://stocksnap.io/photo/young-girl-BE8BA006E6
https://stocksnap.io/photo/young-girl-7174B21D56
https://stocksnap.io/photo/man-woman-6XELVX8KAN
https://stocksnap.io/photo/nexus-smartphones-UAXILBRNUL
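The URL construction can also be pulled out into a small function so it can be checked without touching the network. This is only a minimal sketch: `build_photo_url` is a hypothetical helper name, and it assumes every result carries an `img_id` and at least two entries in `keywords`, as the fields used in the loop suggest. The sample record is assembled by hand from the first line of the output, and its third keyword is invented for illustration.

```python
def build_photo_url(result):
    """Build a photo page URL from one API result (hypothetical helper)."""
    # Assumption: 'keywords' has at least two entries and 'img_id' is present.
    first, second = result['keywords'][:2]
    return 'https://stocksnap.io/photo/{}-{}-{}'.format(first, second, result['img_id'])


# Hand-made sample record; the third keyword is made up.
sample = {'img_id': 'LNXYMM77SS', 'keywords': ['iphone', 'cellphone', 'mobile']}
print(build_photo_url(sample))
```

If a result ever came back with fewer than two keywords, the unpacking would raise a `ValueError`, which may be preferable to silently printing a malformed link.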
Edit: To get the .jpg links, the same approach applies:
import requests

url = 'https://stocksnap.io/api/search-photos/phone/relevance/desc/{}'

page = 1
while True:
    data = requests.get(url.format(page)).json()
    # An empty 'results' list means there are no more pages.
    if not data['results']:
        break
    for r in data['results']:
        # Thumbnail URL: <keyword 1>-<keyword 2>_<image ID>.jpg
        print('https://cdn.stocksnap.io/img-thumbs/280h/{}-{}_{}.jpg'.format(r['keywords'][0], r['keywords'][1], r['img_id']))
    page += 1
Prints:
...
https://cdn.stocksnap.io/img-thumbs/280h/iphone-cellphone_B962ABCAC7.jpg
https://cdn.stocksnap.io/img-thumbs/280h/building-man_D49A8BB4AE.jpg
https://cdn.stocksnap.io/img-thumbs/280h/technology-computer_C9B37875B9.jpg
https://cdn.stocksnap.io/img-thumbs/280h/iphone-cellphone_381F0FD1EE.jpg
https://cdn.stocksnap.io/img-thumbs/280h/work-bag_96E1A8F1CB.jpg
https://cdn.stocksnap.io/img-thumbs/280h/iphone-phone_70FE8C00C9.jpg
https://cdn.stocksnap.io/img-thumbs/280h/iphone-mockup_9FCDF4E1F5.jpg
https://cdn.stocksnap.io/img-thumbs/280h/young-girl_BE8BA006E6.jpg
https://cdn.stocksnap.io/img-thumbs/280h/young-girl_7174B21D56.jpg
https://cdn.stocksnap.io/img-thumbs/280h/man-woman_6XELVX8KAN.jpg
https://cdn.stocksnap.io/img-thumbs/280h/nexus-smartphones_UAXILBRNUL.jpg
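The "keep requesting pages until `results` comes back empty" loop can be separated from the URL formatting. The sketch below assumes that pagination behaviour and the thumbnail pattern seen in the output. `iter_results`, `build_thumb_url`, and `fake_fetch` are hypothetical names, and the fake pages stand in for the real API so the example runs offline.

```python
from itertools import count


def iter_results(fetch_page):
    """Yield results page by page until a page has no results."""
    for page in count(1):
        results = fetch_page(page).get('results')
        if not results:
            return
        yield from results


def build_thumb_url(result):
    """Build a thumbnail URL from one API result (hypothetical helper)."""
    first, second = result['keywords'][:2]
    return 'https://cdn.stocksnap.io/img-thumbs/280h/{}-{}_{}.jpg'.format(
        first, second, result['img_id'])


# Stand-in for the API: two pages of hand-made records, then empty pages.
fake_pages = {
    1: {'results': [{'img_id': 'B962ABCAC7', 'keywords': ['iphone', 'cellphone']}]},
    2: {'results': [{'img_id': 'D49A8BB4AE', 'keywords': ['building', 'man']}]},
}


def fake_fetch(page):
    return fake_pages.get(page, {'results': []})


thumbs = [build_thumb_url(r) for r in iter_results(fake_fetch)]
print(thumbs)
```

Against the real site, `fetch_page` could be something like `lambda page: requests.get(url.format(page)).json()`, reusing the `url` template from the code above.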