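python - Selenium returns the wrong element, selecting from the first sibling instead of looking inside the element itself
Problem description
I'm trying to iterate over a list of elements and print their text, but when I select an element inside another element, Selenium returns the element inside the first sibling rather than the one inside the element I'm actually interested in, which is incredibly strange and frustrating. https://www.thecompleteuniversityguide.co.uk/courses/details/computing-science-with-a-year-in-industry-bsc/54983514 is the site I'm trying to scrape; I'm looking at the Modules section. The key part of my code: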
import time
from selenium.webdriver import Chrome
from selenium.webdriver.chrome.options import Options
opts = Options()
opts.add_argument('--headless')
driver = Chrome(executable_path = 'D:\Programs\Python\chromedriver.exe', options = opts)
driver.get("https://www.thecompleteuniversityguide.co.uk/courses/details/computing-science-with-a-year-in-industry-bsc/54983514")
closeButton = driver.find_element_by_xpath("//a[@id='closeFilter']")
closeButton.click()
driver.find_element_by_xpath("//a[@id='acceptCookie']").click()
modules_container = driver.find_element_by_xpath("//div[@data-sub-sec='Modules']").find_element_by_class_name("cdsb_rt")
numberOfModulesByYear = len(modules_container.find_elements_by_xpath("//div[@class='mdldv']"))
previousNumberOfModules = 0
for moduleYear in range(numberOfModulesByYear):
    moduleYearButtonString = "//div[@class='mdldv' and @data-module-sections='{}']".format(str(moduleYear))
    module_year = modules_container.find_element_by_xpath(moduleYearButtonString)
    module_year_a = module_year.find_element_by_tag_name("a")
    time.sleep(0.5)
    while module_year_a.find_element_by_tag_name("span").get_attribute("class") == "icon icon-add":
        module_year_a.click()
    while len(module_year.find_elements_by_xpath("//div[@class='mdiv']")) - previousNumberOfModules == 0:
        time.sleep(0.01)
    listOfModules = module_year.find_elements_by_xpath("//div[@class='mdiv']")
    previousNumberOfModules = len(module_year.find_elements_by_xpath("//div[@class='mdiv']"))
    for _, module in enumerate(listOfModules):
        print(module.find_element_by_tag_name("a").find_element_by_xpath("//span[@class='mdltxt']").get_attribute("outerHTML"))
    print("\n")
The output I get is:
<span class="mdltxt">Programming 1 (20 credits) - Core</span>
<span class="mdltxt">Programming 1 (20 credits) - Core</span>
<span class="mdltxt">Programming 1 (20 credits) - Core</span>
<span class="mdltxt">Programming 1 (20 credits) - Core</span>
<span class="mdltxt">Programming 1 (20 credits) - Core</span>
<span class="mdltxt">Programming 1 (20 credits) - Core</span>
<span class="mdltxt">Programming 1 (20 credits) - Core</span>
<span class="mdltxt">Programming 1 (20 credits) - Core</span>
<span class="mdltxt">Programming 1 (20 credits) - Core</span>
<span class="mdltxt">Programming 1 (20 credits) - Core</span>
<span class="mdltxt">Programming 1 (20 credits) - Core</span>
<span class="mdltxt">Programming 1 (20 credits) - Core</span>
<span class="mdltxt">Programming 1 (20 credits) - Core</span>
<span class="mdltxt">Programming 1 (20 credits) - Core</span>
<span class="mdltxt">Programming 1 (20 credits) - Core</span>
<span class="mdltxt">Programming 1 (20 credits) - Core</span>
<span class="mdltxt">Programming 1 (20 credits) - Core</span>
<span class="mdltxt">Programming 1 (20 credits) - Core</span>
<span class="mdltxt">Programming 1 (20 credits) - Core</span>
<span class="mdltxt">Programming 1 (20 credits) - Core</span>
<span class="mdltxt">Programming 1 (20 credits) - Core</span>
<span class="mdltxt">Programming 1 (20 credits) - Core</span>
<span class="mdltxt">Programming 1 (20 credits) - Core</span>
<span class="mdltxt">Programming 1 (20 credits) - Core</span>
<span class="mdltxt">Programming 1 (20 credits) - Core</span>
<span class="mdltxt">Programming 1 (20 credits) - Core</span>
<span class="mdltxt">Programming 1 (20 credits) - Core</span>
<span class="mdltxt">Programming 1 (20 credits) - Core</span>
<span class="mdltxt">Programming 1 (20 credits) - Core</span>
<span class="mdltxt">Programming 1 (20 credits) - Core</span>
<span class="mdltxt">Programming 1 (20 credits) - Core</span>
<span class="mdltxt">Programming 1 (20 credits) - Core</span>
<span class="mdltxt">Programming 1 (20 credits) - Core</span>
<span class="mdltxt">Programming 1 (20 credits) - Core</span>
<span class="mdltxt">Programming 1 (20 credits) - Core</span>
<span class="mdltxt">Programming 1 (20 credits) - Core</span>
<span class="mdltxt">Programming 1 (20 credits) - Core</span>
<span class="mdltxt">Programming 1 (20 credits) - Core</span>
<span class="mdltxt">Programming 1 (20 credits) - Core</span>
<span class="mdltxt">Programming 1 (20 credits) - Core</span>
<span class="mdltxt">Programming 1 (20 credits) - Core</span>
<span class="mdltxt">Programming 1 (20 credits) - Core</span>
<span class="mdltxt">Programming 1 (20 credits) - Core</span>
<span class="mdltxt">Programming 1 (20 credits) - Core</span>
<span class="mdltxt">Programming 1 (20 credits) - Core</span>
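This makes no sense to me. When I inspect the a element's HTML it shows the correct name, but when I access it through the XPath function it returns the wrong one. Can anyone help figure out why this happens? If this is expected behavior, it seems very unintuitive.
Edit: for anyone who reads this in the future, I did some more research on XPath. After reading a site that explains it: if you want to search from the current node, and only among that node's descendants, start the XPath with ".//". The dot anchors the search at the current element, and "//" matches descendants at any depth. Without the dot, a leading "//" searches from the document root, no matter which element the call is made on. So this is not a problem with XPath itself, just a simple syntax detail that can be scary for people new to this sort of thing. Good luck to everyone doing this!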
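To make the difference between "//" and ".//" concrete, here is a small sketch that needs no browser. It uses Python's standard xml.etree.ElementTree on made-up markup loosely modelled on the module list. ElementTree refuses absolute paths on an element, so the helper find_first (a hypothetical name, not part of Selenium or ElementTree) only emulates where the search starts; it is an illustration of the rule, not Selenium's implementation.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup, loosely modelled on the module list in the question.
PAGE = """
<div class="modules">
  <div class="mdiv"><a><span class="mdltxt">Programming 1</span></a></div>
  <div class="mdiv"><a><span class="mdltxt">Database Systems</span></a></div>
  <div class="mdiv"><a><span class="mdltxt">Web-Based Programming</span></a></div>
</div>
"""

root = ET.fromstring(PAGE)


def find_first(document_root, context, xpath):
    """Emulate where an XPath search starts (hypothetical helper).

    A leading "//" searches from the document root and ignores `context`;
    a leading ".//" searches only the descendants of `context`.
    """
    if xpath.startswith(".//"):
        return context.find(xpath)
    if xpath.startswith("//"):
        # ElementTree rejects absolute paths on elements, so rewrite the
        # expression as a descendant search anchored at the document root.
        return document_root.find("." + xpath)
    raise ValueError("this sketch only handles '//' and './/' prefixes")


modules = root.findall(".//div[@class='mdiv']")

# Same call on each module element; only the prefix differs.
absolute = [find_first(root, m, "//span[@class='mdltxt']").text for m in modules]
relative = [find_first(root, m, ".//span[@class='mdltxt']").text for m in modules]

print(absolute)  # the first span in the document, once per module
print(relative)  # each module's own span
```

With "//" every lookup lands on the first span in the document, which matches the repeated "Programming 1" output above; with ".//" each module yields its own title.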
Solution
This looks like a relative XPath issue, though I'm not entirely sure. When I locate the element by class name instead, it works:
print(module.find_element_by_tag_name("a").find_element_by_class_name('mdltxt').get_attribute("outerHTML"))
<span class="mdltxt">Programming 1 (20 credits) - Core</span>
<span class="mdltxt">Database Systems (20 credits) - Core</span>
<span class="mdltxt">Web-Based Programming (20 credits) - Core</span>
<span class="mdltxt">Systems Development (20 credits) - Core</span>
<span class="mdltxt">Computing Principles (20 credits) - Core</span>
<span class="mdltxt">Programming 1 (20 credits) - Core</span>
<span class="mdltxt">Database Systems (20 credits) - Core</span>
<span class="mdltxt">Web-Based Programming (20 credits) - Core</span>
<span class="mdltxt">Systems Development (20 credits) - Core</span>
<span class="mdltxt">Computing Principles (20 credits) - Core</span>
<span class="mdltxt">Software Engineering 1 (20 credits) - Core</span>
<span class="mdltxt">Programming 2 (20 credits) - Core</span>
<span class="mdltxt">Architectures and Operating Systems (20 credits) - Core</span>
<span class="mdltxt">Data Structures and Algorithms (20 credits) - Core</span>
<span class="mdltxt">Year in Industry (80 credits) - Core</span>
<span class="mdltxt">Industrial Project Report (40 credits) - Core</span>
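As an alternative to switching to the class-name locator, the XPath lookups can be kept and simply anchored with ".//". The sketch below assumes Selenium 4's find_element(by, value) / find_elements(by, value) methods (the find_element_by_* helpers used in the question were removed in later Selenium 4 releases); module_titles is a hypothetical helper name, and the string "xpath" is the value of By.XPATH, written out so the snippet has no imports.

```python
XPATH = "xpath"  # same string value as selenium.webdriver.common.by.By.XPATH


def module_titles(container):
    """Return the outerHTML of every module title inside `container`.

    `container` is assumed to be a Selenium WebElement (or any object with
    the same find_elements(by, value) method). Both expressions start with
    ".//", so each search stays inside the element it is called on.
    """
    titles = []
    for module in container.find_elements(XPATH, ".//div[@class='mdiv']"):
        span = module.find_element(XPATH, ".//span[@class='mdltxt']")
        titles.append(span.get_attribute("outerHTML"))
    return titles
```

In the question's loop this would be called as module_titles(module_year) once the year has been expanded. Because the search is scoped to module_year, it should return only that year's modules rather than every module loaded so far, which is why earlier years repeat in the output above.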