python - Parsing table data with Selenium and Python
Problem description
<table class="table table-striped">
<thead>
<tr class="reactable-column-header">
<th class="reactable-th-status reactable-header-sortable " role="button" tabindex="0"><strong></strong></th>
<th class="reactable-th-question_id reactable-header-sortable reactable-header-sort-asc" role="button"
tabindex="0"><strong>#</strong></th>
<th class="reactable-th-question_title reactable-header-sortable " role="button" tabindex="0">
<strong>Title</strong></th>
<th class="reactable-th-editorial reactable-header-sortable " role="button" tabindex="0">
<strong>Solution</strong></th>
<th class="reactable-th-acceptance reactable-header-sortable " role="button" tabindex="0">
<strong>Acceptance</strong></th>
<th class="reactable-th-difficulty reactable-header-sortable " role="button" tabindex="0">
<strong>Difficulty</strong></th>
<th class="reactable-th-frequency reactable-header-sortable " role="button" tabindex="0"><strong>Frequency
<span id="frequency-tooltip" class="fa fa-lock" data-toggle="tooltip" data-placement="top" title=""
data-original-title="Only premium members can see the frequency"></span></strong></th>
</tr>
</thead>
<tbody class="reactable-data">
<tr>
<td label="[object Object]"></td>
<td label="[object Object]">1</td>
<td value="Two Sum" label="[object Object]">
<div><a href="/problems/two-sum">Two Sum</a> </div>
</td>
<td label="[object Object]"><a href="/articles/two-sum"><i class="fa fa-file-text"></i></a></td>
<td value="44.23248982536708" label="[object Object]">44.2%</td>
<td value="[object Object]" label="[object Object]"><span class="label label-success round">Easy</span></td>
<td label="[object Object]"></td>
</tr>
<tr>
<td label="[object Object]"></td>
<td label="[object Object]">2</td>
<td value="Add Two Numbers" label="[object Object]">
<div><a href="/problems/add-two-numbers">Add Two Numbers</a> </div>
</td>
<td label="[object Object]"><a href="/articles/add-two-numbers"><i class="fa fa-file-text"></i></a></td>
<td value="31.20978757805531" label="[object Object]">31.2%</td>
<td value="[object Object]" label="[object Object]"><span class="label label-warning round">Medium</span></td>
<td label="[object Object]"></td>
</tr>
<tr>
<td label="[object Object]"></td>
</tr>
</tbody>
</table>
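The HTML above represents the two rows shown in the image. I want to iterate over the table rows and print the titles (Two Sum and Add Two Numbers) using Selenium and Python.
However, the table structure is quite complex, and I'm not sure how to write a generic function that would also work for a larger table with more rows.
Any help?
Solution
If you use Selenium with the following XPath, it returns all the cells under the rows in the table body: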
//table[@class='table table-striped']/tbody[@class='reactable-data']//tr//td
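However, you need to identify the cell you are looking for, either by its index or by a particular text in a particular tag inside the cell (here, the links Two Sum and Add Two Numbers).
In this case, the titles are the links in the third cell of each row, so your XPath should be: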
//table[@class='table table-striped']/tbody[@class='reactable-data']//tr//td[3]//a
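To see what the two XPath expressions select without starting a browser, here is a minimal sketch that runs equivalent paths against a trimmed, well-formed copy of the question's table using Python's standard `xml.etree.ElementTree`. This is a stand-in for Selenium, not part of the original answer: ElementTree supports only a subset of XPath, the paths are written relative to the `<table>` root, and the `SAMPLE` markup drops the header and the `label` attributes for brevity.

```python
import xml.etree.ElementTree as ET

# Trimmed, well-formed copy of the table body from the question (assumption:
# header row and label attributes are omitted; only two data rows are kept).
SAMPLE = """
<table class="table table-striped">
  <tbody class="reactable-data">
    <tr>
      <td></td>
      <td>1</td>
      <td value="Two Sum"><div><a href="/problems/two-sum">Two Sum</a></div></td>
      <td><a href="/articles/two-sum"><i class="fa fa-file-text"></i></a></td>
      <td value="44.23248982536708">44.2%</td>
    </tr>
    <tr>
      <td></td>
      <td>2</td>
      <td value="Add Two Numbers"><div><a href="/problems/add-two-numbers">Add Two Numbers</a></div></td>
      <td><a href="/articles/add-two-numbers"><i class="fa fa-file-text"></i></a></td>
      <td value="31.20978757805531">31.2%</td>
    </tr>
  </tbody>
</table>
""".strip()

root = ET.fromstring(SAMPLE)  # root is the <table> element

# Every cell of every body row (the broad XPath).
cells = root.findall("./tbody[@class='reactable-data']/tr/td")

# Only the links inside the third cell of each row (the narrowed XPath).
links = root.findall("./tbody[@class='reactable-data']/tr/td[3]//a")
titles = [link.text for link in links]

print(len(cells))
print(titles)
```

The fourth cell of each row also contains a link, which is why the position predicate `td[3]` is needed to pick out only the title links.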
When dealing with dynamic elements, it is always a good idea to use WebDriverWait.
Here is the complete code.
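from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# Selenium 4 style; on Selenium 3, pass executable_path='path/to/chromedriver' instead.
driver = webdriver.Chrome(service=Service('path/to/chromedriver'))
driver.get("url")

# Wait up to 20 seconds for the title links in the third column to become visible.
xpath = "//table[@class='table table-striped']/tbody[@class='reactable-data']//tr//td[3]//a"
elements = WebDriverWait(driver, 20).until(
    EC.visibility_of_all_elements_located((By.XPATH, xpath))
)
for ele in elements:
    print(ele.text)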
The output is printed to the console:
Two Sum
Add Two Numbers
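The XPath already matches every row in the table body, so it works unchanged for larger tables. What remains hard-coded is the column index. One possible way to make that generic is to derive the index from the header text, sketched below. The helper name `column_link_xpath` and the `TABLE` constant are hypothetical, and the sketch assumes each `<th>` corresponds to exactly one `<td>` per row (no `colspan`).

```python
# Hypothetical helper, not part of the original answer.
TABLE = "//table[@class='table table-striped']"


def column_link_xpath(header_texts, column_name):
    """Return an XPath for the links in the column whose header is column_name.

    Raises ValueError if no header matches column_name.
    """
    headers = [text.strip() for text in header_texts]
    index = headers.index(column_name) + 1  # XPath positions are 1-based
    return f"{TABLE}/tbody[@class='reactable-data']//tr//td[{index}]//a"


# Header texts as they appear in the question's <thead>. With Selenium they
# could be read from the <th> elements under TABLE + "/thead//th".
headers = ["", "#", "Title", "Solution", "Acceptance", "Difficulty", "Frequency"]
xpath = column_link_xpath(headers, "Title")
print(xpath)
```

The resulting string can be passed to `EC.visibility_of_all_elements_located((By.XPATH, xpath))` in place of the hard-coded expression.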