Finding elements between divs with Selenium in Python

Question

I have the following HTML and want to extract the years and the names. Nothing I have tried has worked:

<div class="Year">
    <span class="date">2019</span>
</div>



<div class="cl2">
    <span class="name">name1</span>
</div>
<div class="cl2">
    <span class="name">name2</span>
</div>
<div class="cl2">
    <span class="name">name3</span>
</div>
<div class="cl2">
    <span class="name">name4</span>
</div>



<div class="Year">
    <span class="date">2020</span>
</div>

<div class="cl2">
    <span class="name">name5</span>
</div>
<div class="cl2">
    <span class="name">name6</span>
</div>

What I want to get is:

2019
name1
name2
name3
name4
2020
name5
name6

I tried the following, using XPath:

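from selenium.webdriver.common.by import By

# `driver` is an already-open WebDriver on the page above.
# XPath attribute values are case-sensitive: the divs use class "Year" and "cl2".
years = driver.find_elements(By.XPATH, "//div[@class='Year']")
for year in years:
    print(year.find_element(By.XPATH, ".//span[@class='date']").text)

names = driver.find_elements(By.XPATH, "//div[@class='cl2']")
for name in names:
    print(name.find_element(By.XPATH, ".//span[@class='name']").text)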

What I get instead is:

2019
2020
name1
name2
name3
name4
name5
name6

Tags: python, selenium

Answer


You can use the XPath preceding axis to look up, for each name, the nearest date that comes before it:

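from selenium.webdriver.common.by import By

names = dict()
for e in driver.find_elements(By.CLASS_NAME, 'name'):
    name = e.text
    # preceding:: selects every earlier date span; the parenthesized result is
    # in document order, so [last()] picks the one closest to this name.
    year = e.find_element(By.XPATH, "(./preceding::span[@class='date'])[last()]").text
    names[name] = year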

{'name1':'2019','name2':'2019','name3':'2019','name4':'2019','name5':'2020','name6':'2020'}
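
The dictionary above maps each name to its year, while the question asks for each year followed by its names. A minimal sketch of that regrouping, assuming the mapping shown above; the variable names by_year and lines are only illustrative:

```python
# Mapping copied from the result shown above (name -> year).
names = {'name1': '2019', 'name2': '2019', 'name3': '2019', 'name4': '2019',
         'name5': '2020', 'name6': '2020'}

# Group the names under their year. Dicts keep insertion order (Python 3.7+),
# so the years and names stay in the order they were collected from the page.
by_year = {}
for name, year in names.items():
    by_year.setdefault(year, []).append(name)

# Flatten to one line per year, followed by the names belonging to it.
lines = []
for year, group in by_year.items():
    lines.append(year)
    lines.extend(group)

print('\n'.join(lines))
```

Because the names are used as dictionary keys, a name that appears under more than one year would be kept only once, so this only fits pages where the names are unique.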

You can also fetch all the date and name elements in a single query and use each element's class to tell them apart:

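from selenium.webdriver.common.by import By

names = dict()
year = None
# A CSS selector group returns matches in document order, so each date is seen before its names.
for e in driver.find_elements(By.CSS_SELECTOR, '.date, .name'):
    if 'name' in e.get_attribute('class'):
        names[e.text] = year
    if 'date' in e.get_attribute('class'):
        year = e.text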

{'name1':'2019','name2':'2019','name3':'2019','name4':'2019','name5':'2020','name6':'2020'}
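
Since the single-pass loop above already visits the elements in page order, the output requested in the question can probably be produced directly, without building a dictionary. The sketch below is one way to do that. The helper lines_in_page_order and the StubElement class are hypothetical names introduced here, not part of Selenium; the stub only imitates the two WebElement members the helper relies on (text and get_attribute) so the example runs without a browser.

```python
from dataclasses import dataclass


def lines_in_page_order(elements):
    """Return the text of each date/name element, in the order given."""
    lines = []
    for e in elements:
        # Split the class attribute so "name" does not match e.g. "username".
        classes = (e.get_attribute('class') or '').split()
        if 'date' in classes or 'name' in classes:
            lines.append(e.text)
    return lines


@dataclass
class StubElement:
    """Hypothetical stand-in for a Selenium WebElement (demo only)."""
    text: str
    css_class: str

    def get_attribute(self, attr):
        return self.css_class if attr == 'class' else None


# Stand-in data mirroring part of the HTML from the question.
demo = [
    StubElement('2019', 'date'),
    StubElement('name1', 'name'),
    StubElement('name2', 'name'),
    StubElement('2020', 'date'),
    StubElement('name5', 'name'),
]
print('\n'.join(lines_in_page_order(demo)))
```

With a real browser session, the list passed in would be the result of driver.find_elements(By.CSS_SELECTOR, '.date, .name'), which returns the matches in document order.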

