Getting error: Message: stale element reference: element is not attached to the page document

Problem description

After going to a page and then returning to the previous page with driver.get(), I get this error when I look up elements on that page:

  File "C:\Users\ASUS\Desktop\x\index4.py", line y, in <module>
    basket = item.find_elements_by_xpath('xpath')
   
  (...)

selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: element is not attached to the page document  

My code:

list_url = "URL"
driver.get(list_url)

staleElement = True
while staleElement:
    staleElement = False
    driver.refresh()
    list_items = driver.find_elements_by_class_name("classname1")
    for item in list_items:
        basket = False
        try:
            basket = item.find_elements_by_xpath('xpath')
        except exceptions.StaleElementReferenceException as e:
            basket = item.find_elements_by_xpath('xpath')

        if basket[0] and "text1" in basket[0].text:
            price = item.find_elements_by_xpath('xpath1')[0].text

            item_link = item.find_element_by_class_name("classname2").get_attribute("href")

            if int(price) < 101:
                driver.get(item_link)
                if len(driver.find_elements_by_xpath('xpath2')) > 0:
                    driver.get(list_url)
                    staleElement = True
                else:
                    driver.find_element_by_xpath('xpath3').click()

Tags: python, selenium, selenium-webdriver

Solution


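Selenium does not give you real objects, only references to objects in the browser's memory. When you load a new URL (with driver.get(...) or click()), the browser loads new data into its memory, and every reference to an object on the previous page becomes stale. The references stay stale even if you load the previous page again, because the objects may now sit at a different place in the browser's memory.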
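To make the idea concrete, here is a toy model of that behaviour. It is not Selenium: ToyBrowser, ToyElement and StaleElementError are hypothetical stand-ins invented for this sketch, and the "generation" counter is only an assumption about how one could model a page reload. It shows that an element reference dies on navigation while a plain string copied out of it survives.

```python
class StaleElementError(Exception):
    """Stand-in for Selenium's StaleElementReferenceException."""


class ToyBrowser:
    """Toy browser: every navigation rebuilds the page from scratch."""

    def __init__(self, pages):
        self.pages = pages        # url -> list of texts on that page
        self.generation = 0       # incremented on every page load
        self.url = None

    def get(self, url):
        self.url = url
        self.generation += 1      # the previous document is discarded

    def find_elements(self):
        return [ToyElement(self, i) for i in range(len(self.pages[self.url]))]


class ToyElement:
    """A reference into one specific page load, not a copy of the data."""

    def __init__(self, browser, index):
        self.browser = browser
        self.index = index
        self.url = browser.url
        self.generation = browser.generation

    @property
    def text(self):
        if self.generation != self.browser.generation:
            raise StaleElementError("element is not attached to the page document")
        return self.browser.pages[self.url][self.index]


browser = ToyBrowser({"list": ["item-a", "item-b"], "detail": ["buy"]})
browser.get("list")
items = browser.find_elements()
texts = [item.text for item in items]   # plain strings, safe to keep

browser.get("detail")
browser.get("list")                     # same URL, but a fresh page load

try:
    items[0].text                       # old reference from the first load
    went_stale = False
except StaleElementError:
    went_stale = True

# Looking the elements up again after the reload works as expected.
fresh_texts = [item.text for item in browser.find_elements()]
```

The practical consequence is the same as in the real driver: read what you need into strings before navigating, or find the elements again after every page load.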

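You have to use two for loops.

In the first for loop, collect every "href" (item_link) and append it to a list instead of calling driver.get(item_link). Once you have all the "href" values, call driver.get(item_link) for each of them in the second for loop.

I can't test it, but it could look like this: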

list_url = "URL"

staleElement = True

while staleElement:
    staleElement = False

    driver.get(list_url)  # load page instead of refreshing because in next loop it may have different page in memory
    #driver.refresh()

    list_items = driver.find_elements_by_class_name("classname1")

    # first for-loop: get all `hrefs` (as strings)
    
    all_hrefs = []  # list for all strings with `"href"`
    
    for item in list_items:
        basket = False
        try:
            basket = item.find_elements_by_xpath('xpath')
        except exceptions.StaleElementReferenceException as e:
            basket = item.find_elements_by_xpath('xpath')

        if basket[0] and "text1" in basket[0].text:
            price = item.find_elements_by_xpath('xpath1')[0].text

            item_link = item.find_element_by_class_name("classname2").get_attribute("href")

            if int(price) < 101:
                all_hrefs.append(item_link)  # add string `"href"` to list

    # second for-loop: use all `hrefs`

    for item_link in all_hrefs:
        driver.get(item_link)
        if len(driver.find_elements_by_xpath('xpath2')) > 0:
            staleElement = True
            #driver.get(list_url)  # there is no need to go back to previous page
        #else:
        #    driver.find_element_by_xpath('xpath3').click()  # there is no need to go back to previous page
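
The find_element_by_* and find_elements_by_* helpers used above were removed in Selenium 4.3, so on a current install the same two-pass idea might be written roughly as below. This is an untested sketch under assumptions: collect_links, visit_links and max_price are hypothetical names introduced here, the locator values are the placeholders from the question, and the strategy strings "class name" and "xpath" are the values of By.CLASS_NAME and By.XPATH (normally imported from selenium.webdriver.common.by).

```python
def collect_links(driver, list_url, max_price=100):
    """Pass 1: read everything needed from the list page as plain strings."""
    driver.get(list_url)
    links = []
    # "class name" / "xpath" are the values of By.CLASS_NAME / By.XPATH.
    for item in driver.find_elements("class name", "classname1"):
        basket = item.find_elements("xpath", "xpath")
        # Skip items without a basket element or without the expected text.
        if not basket or "text1" not in basket[0].text:
            continue
        price = item.find_elements("xpath", "xpath1")[0].text
        href = item.find_element("class name", "classname2").get_attribute("href")
        if int(price) <= max_price:
            links.append(href)      # keep the string, not the element
    return links


def visit_links(driver, links):
    """Pass 2: navigate freely; no element from the list page is reused."""
    needs_retry = False
    for href in links:
        driver.get(href)
        if driver.find_elements("xpath", "xpath2"):
            needs_retry = True      # same signal as staleElement = True
    return needs_retry
```

A caller could then loop with something like `while visit_links(driver, collect_links(driver, list_url)): pass`, which mirrors the while staleElement loop above. Checking `not basket` before indexing also avoids an IndexError when an item has no basket element.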
