selenium - Splinter - Element not clickable because another element obscures it
Problem description
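I am trying to scrape some thumbnails from a website by reading their src attribute, and also to click a link so that I can fetch the larger image later.
For this, I am using Splinter with BeautifulSoup.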
This is the HTML of the first element I need to get:
For this, I have the following code:
from bs4 import BeautifulSoup
from splinter import Browser

executable_path = {"executable_path": "/path/to/geckodriver"}
browser = Browser("firefox", **executable_path, headless=False)

def get_player_images():
    url = 'https://www.premierleague.com/players'
    # Initiate a splinter instance of the URL
    browser.visit(url)
    browser.find_by_tag('div[class="table playerIndex"]')
    # Parse whatever HTML the browser holds at this moment
    soup = BeautifulSoup(browser.html, 'html.parser')
    for el in soup:
        td = el.findAll('td')
        for each_td in td:
            link = each_td.find('a', href=True)
            if link:
                print(link['href'])
            image = each_td.find('img')
            if image:
                print(image['src'])

# run
get_player_images()
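But after the browser opens, I run into 2 problems:
I only get the real src for the first two players. After that, the photo is missing, and I don't understand why.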
/players/19970/Max-Aarons/overview
https://resources.premierleague.com/premierleague/photos/players/40x40/p232980.png
/players/13279/Abdul-Rahman-Baba/overview
https://resources.premierleague.com/premierleague/photos/players/40x40/p118335.png
/players/13286/Tammy-Abraham/overview
//platform-static-files.s3.amazonaws.com/premierleague/photos/players/40x40/Photo-Missing.png
/players/3512/Adam-Smith/overview
//platform-static-files.s3.amazonaws.com/premierleague/photos/players/40x40/Photo-Missing.png
/players/10905/Che-Adams/overview
....
Also, if I try to click the href link, like this:
if link:
    browser.click_link_by_partial_href(link['href'])
I get the error:
selenium.common.exceptions.ElementClickInterceptedException: Message: Element <a class="playerName" href="/players/19970/Max-Aarons/overview"> is not clickable at point (244,600) because another element <p> obscures it
What am I doing wrong? I am having a lot of trouble with Selenium.
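Solution
The player data is loaded dynamically via JavaScript, so the HTML you parse does not yet contain all of it. You can use the requests module to fetch the information directly from the API the page calls.
For example: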
import json

import requests

url = 'https://footballapi.pulselive.com/football/players?pageSize=30&compSeasons=274&altIds=true&page={page}&type=player&id=-1&compSeasonId=274'
img_url = 'https://resources.premierleague.com/premierleague/photos/players/250x250/{player_id}.png'
headers = {'Origin': 'https://www.premierleague.com'}

for page in range(1, 10):  # <--- increase this to desired number of pages
    data = requests.get(url.format(page=page), headers=headers).json()

    # uncomment this to print all data:
    # print(json.dumps(data, indent=4))

    # each entry carries the display name and the ID used in the photo URL
    for player in data['content']:
        print('{:<50} {}'.format(player['name']['display'], img_url.format(player_id=player['altIds']['opta'])))
Prints:
Ethan Ampadu https://resources.premierleague.com/premierleague/photos/players/250x250/p199598.png
Joseph Anang https://resources.premierleague.com/premierleague/photos/players/250x250/p447879.png
Florin Andone https://resources.premierleague.com/premierleague/photos/players/250x250/p93284.png
André Gomes https://resources.premierleague.com/premierleague/photos/players/250x250/p120250.png
Andreas Pereira https://resources.premierleague.com/premierleague/photos/players/250x250/p156689.png
Angeliño https://resources.premierleague.com/premierleague/photos/players/250x250/p145235.png
Faustino Anjorin https://resources.premierleague.com/premierleague/photos/players/250x250/p223332.png
Michail Antonio https://resources.premierleague.com/premierleague/photos/players/250x250/p57531.png
Cameron Archer https://resources.premierleague.com/premierleague/photos/players/250x250/p433979.png
Archie Davies https://resources.premierleague.com/premierleague/photos/players/250x250/p215061.png
Stuart Armstrong https://resources.premierleague.com/premierleague/photos/players/250x250/p91047.png
Marko Arnautovic https://resources.premierleague.com/premierleague/photos/players/250x250/p41464.png
Kepa Arrizabalaga https://resources.premierleague.com/premierleague/photos/players/250x250/p109745.png
Harry Arter https://resources.premierleague.com/premierleague/photos/players/250x250/p48615.png
Daniel Arzani https://resources.premierleague.com/premierleague/photos/players/250x250/p200797.png
... and so on.
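The loop above stops at a hard-coded page count. A minimal sketch of an alternative is to keep requesting pages until one comes back without players. It assumes, without confirmation from the API, that an exhausted listing returns an empty 'content' list. The names iter_players, fetch_page and fetch_page_live are hypothetical helpers introduced for illustration only.

```python
from typing import Callable, Dict, Iterator


def iter_players(fetch_page: Callable[[int], Dict], first_page: int = 1,
                 max_pages: int = 100) -> Iterator[Dict]:
    """Yield player records page by page until a page has no players.

    fetch_page is any callable that takes a page number and returns the
    decoded JSON for that page. max_pages is a safety cap so the loop
    always terminates even if the API never returns an empty page.
    """
    for page in range(first_page, first_page + max_pages):
        data = fetch_page(page)
        players = data.get('content', [])
        if not players:
            # Assumption: an empty 'content' list marks the end of the data.
            break
        yield from players


def fetch_page_live(page: int) -> Dict:
    """Fetch one page from the endpoint used in the answer above."""
    import requests  # imported here so the generator itself has no dependency

    url = ('https://footballapi.pulselive.com/football/players?pageSize=30'
           '&compSeasons=274&altIds=true&page={page}&type=player&id=-1'
           '&compSeasonId=274')
    headers = {'Origin': 'https://www.premierleague.com'}
    return requests.get(url.format(page=page), headers=headers).json()
```

It could then be used as `for player in iter_players(fetch_page_live): ...`. Passing the fetcher in as an argument keeps the paging logic testable without network access.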
Note: to get smaller thumbnails, change 250x250 in the image URL to 40x40.
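One way to switch between the two sizes without editing the URL string by hand is to make the size a placeholder as well. This is only a sketch: player_image_url is a hypothetical helper, and 250x250 and 40x40 are the only sizes that appear in this thread, so any other value is an untested assumption.

```python
# Same image URL as in the answer, with the size turned into a placeholder.
IMG_URL = ('https://resources.premierleague.com/premierleague/photos/'
           'players/{size}/{player_id}.png')


def player_image_url(player_id: str, size: str = '250x250') -> str:
    """Build the photo URL for a player ID such as 'p232980'."""
    return IMG_URL.format(size=size, player_id=player_id)


# The 40x40 form matches the thumbnail URL printed in the question.
print(player_image_url('p232980', size='40x40'))
```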