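python - Selenium can't find elements generated by JavaScript
Question
What I want to do is launch a browser, load the page (with its JavaScript rendered), and use BeautifulSoup to find the element I need. Here is my code: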
from selenium import webdriver
from bs4 import BeautifulSoup as bs4
from selenium.webdriver.support.ui import WebDriverWait
browser = webdriver.Edge()
browser.get('https://www.premierleague.com/match/22721')
element = WebDriverWait(browser, 10)
html=bs4(browser.page_source,'html.parser')
print(html.body.main.find('div',attrs={'class':'mcTabs'}))
browser.quit()
The print statement gives me None.
Solution
First, there is a typo in your code:
print(html.body.main.find('div',attrs='class':'mcTabs'}))
It should be:
print(html.body.main.find('div',attrs={'class':'mcTabs'})) # the opening { was missing
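Second:
element = WebDriverWait(browser, 10)
is redundant, because you never use element anywhere. Creating a WebDriverWait object does not wait by itself; the waiting only happens when you call its until() method.
Now, back to the actual problem. I'm not very familiar with BeautifulSoup, but here is what I found: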
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC

browser.get('https://www.premierleague.com/match/22721')
# wait (up to 10 seconds) for the element to be present in the DOM
WebDriverWait(browser, 10).until(EC.presence_of_element_located((By.CSS_SELECTOR, "div.mcTabs")))
# read the page source only once the element is present
html = bs4(browser.page_source, 'html.parser')
print(html.body.main.find('div', attrs={'class': 'mcTabs'}).prettify())
Explanation: you were reading page_source before the document was fully ready. That is why you have to wait until div.mcTabs appears in the DOM, and only then read page_source.
Output:
<div class="mcTabs">
<section class="mcLatestContainer mcMainTab active" data-ui-args='{"type": "latest"}' data-ui-tab="Latest">
<nav class="tabs" data-built-class="matchLatestContainer" data-script="pl_tabbed" data-tab-class="mcLatestTab" data-tab-wrap=".tabs" data-widget="tabbed-content">
</nav>
<div class="matchLatestContainer">
<nav class="tabs">
<ul class="tablist" role="tablist">
<li class="active" data-tab-index="0" role="tab" tabindex="0">
Latest
</li>
<li data-tab-index="1" role="tab" tabindex="0">
Photos
</li>
</ul>
</nav>
<div class="blogStreamMatchContainer mcLatestTab active" data-tab-aware-default="true" data-ui-tab="Latest">
<div class="preMatchContainer" style="display: none;">
<div class="matchPreviewStreamContainer">
</div>
<p class="noContentAvailableContainer" style="display: none;">
No Content Available
</p>
</div>
<div class="liveMatchContainer" style="">
<section class="matchBlog">
<div class="wrapper">
<div class="mcBlogStream">
<div class="matchReportStreamContainer" data-report-rendered="true">
<header>
<h3 class="subHeader">
Match summary
</h3>
</header>
<div class="wrapper col-12">
<div class="standardArticle">
<p>
Manuel Lanzini scored twice as West Ham United finished the season with a 3-1 win over Everton.
</p>
<p>
The midfielder opened the scoring from the edge of the area on 39 minutes after latching on to Marko Arnautovic's flick of a Cheikhou Kouyate pass.
</p>
<p>
Arnautovic doubled the lead in the 63rd minute with a fierce shot for his 11th goal of the season.
</p>
...
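For intuition, the explicit wait in the answer above amounts to polling a condition until it holds or a timeout expires. The following is a minimal standard-library sketch of that idea, not Selenium's actual implementation: wait_until, FakePage and find_tabs are hypothetical names invented for this illustration, and FakePage merely stands in for a page whose element shows up after a few checks.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds pass.
    (Hypothetical helper, shown only to illustrate the polling idea.)
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %.1f s" % timeout)
        time.sleep(poll_interval)


class FakePage:
    """Stand-in for a page whose element only appears after a few checks."""

    def __init__(self, ready_after):
        self.checks = 0
        self.ready_after = ready_after

    def find_tabs(self):
        self.checks += 1
        if self.checks >= self.ready_after:
            return '<div class="mcTabs"></div>'
        return None  # element not rendered yet


page = FakePage(ready_after=3)
# The first two checks find nothing; the third one succeeds.
found = wait_until(page.find_tabs, timeout=2.0, poll_interval=0.01)
print(found)
```

Reading the page once, without such a loop, corresponds to the very first check, which is why the original code printed None.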