Using Selenium explicit waits in Python

Problem description

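I have a list of IPs for roughly 600 modems that are currently deployed. Some of the modems are reachable, while others are not, because of connectivity problems or dead batteries. I basically want to loop over all the IPs and explicitly wait for the text string at the bottom of the page that contains the modem's firmware version. If the site does not load, I want it to add "No connection" in place of the IP. I want to append the firmware to an empty list and then write that list to a CSV.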

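My main obstacle seems to be that the WebDriverWait(driver,30).until part of the code always throws an exception, even though I have previously pulled data from the same locator. I can see the browser load the page, and it still throws the exception. I have tried several of the "standard expected conditions", but none of them seem to work.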

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.support.ui import Select
import os
import csv

prefix = 'http://'
suffix = ':9191'

fwlist = []
iplist = ["###.###.###.###", ...]


class presence_of_element_located(object):
    """ An expectation for checking that an element is present on the DOM
    of a page. This does not necessarily mean that the element is visible.
    locator - used to find the element
    returns the WebElement once it is located
    """
    def __init__(self, locator):
        self.locator = locator

    def __call__(self, driver):
        return _find_element(driver, self.locator)

for ip in iplist:
        url = prefix + ip + suffix
        # instantiate a chrome options object so you can set the size and headless preference
        # some of these chrome options might be unnecessary but I just used a boilerplate
        # change the <path_to_download_default_directory> to whatever your default download folder is located
        chrome_options = Options()
        #chrome_options.add_argument("--headless")
        chrome_options.add_argument("--window-size=1920x1080")
        chrome_options.add_argument("--disable-notifications")
        chrome_options.add_argument('--no-sandbox')
        chrome_options.add_argument('--verbose')
        chrome_options.add_experimental_option("prefs", {
                "download.default_directory": r"Z:\Python\Scaping\Chromedriver\chromedriver.exe",
                "download.prompt_for_download": False,
                "download.directory_upgrade": True,
                "safebrowsing_for_trusted_sources_enabled": False,
                "safebrowsing.enabled": False
        })
        #chrome_options.add_argument('--disable-software-rasterizer')
        # initialize driver object and change the <path_to_chrome_driver> depending on your directory where your chromedriver should be
        driver = webdriver.Chrome(chrome_options=chrome_options, executable_path=r"Z:\Python\Scaping\Chromedriver\chromedriver.exe")
        # get request to target the site selenium is active on
        driver.get(f'{url}')
        try:
                WebDriverWait(driver,30,.5).until(presence_of_element_located((By.ID,"login_screen")))
                print(f'{url} is ready')
        except:
                print(f'Timeout for {url}')
                fw = 'No connection'
                fwlist.append(fw)

        text = driver.find_element_by_xpath("/html/body/div[2]/div/div[2]/div[4]").text
        fw = text.split(' ',3)
        fwlist.append(fw[3])
        print(fwlist)
        driver.close()

with open('Scaping\PGE-FW.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerows(fw)

driver.quit()

For testing I am using one working IP and one non-working IP. I can see the working one load fine within 30 seconds, but it always ends up in the exception.

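Below is an image of the HTML showing the ID I wait on and the XPath of the data I want. I don't really know how to give you the HTML of the modem's internal web page, but if anyone knows a way, please let me know. So I guess I have two questions: am I making some obvious mistake that explains why this doesn't work, or is there perhaps a better way to approach this?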

[Image: HTML of the modem page, showing the element ID used for the wait and the XPath of the target data]

Tags: python, selenium, selenium-webdriver

Solution


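You forgot an import. The script defines its own copy of presence_of_element_located, whose __call__ uses _find_element, a helper that is not defined anywhere in the posted code. Calling it would raise a NameError, and the bare except: then reports that as a timeout. Import Selenium's expected conditions and use the library's version of the condition instead.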
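The "always times out" symptom can be reproduced without a browser. The sketch below is only an illustration, assuming the script ran as posted: `LocalCondition`, `check_with_bare_except`, and `check_with_named_exception` are hypothetical names, and the built-in `TimeoutError` stands in for Selenium's `TimeoutException`. It shows how a bare `except:` hides an error that a narrower `except` would let through.

```python
class LocalCondition:
    """Hypothetical stand-in for the copied presence_of_element_located class."""

    def __init__(self, locator):
        self.locator = locator

    def __call__(self, driver):
        # _find_element is not defined in this script, so this raises NameError.
        return _find_element(driver, self.locator)


def check_with_bare_except(condition):
    """Mimics the question's try/except: every error looks like a timeout."""
    try:
        condition(None)
        return "ready"
    except:  # noqa: E722 - deliberately too broad, as in the question
        return "timeout"


def check_with_named_exception(condition):
    """Catches only the expected error type, so other bugs stay visible."""
    try:
        condition(None)
        return "ready"
    except TimeoutError:
        return "timeout"


condition = LocalCondition(("id", "login_screen"))
print(check_with_bare_except(condition))

try:
    check_with_named_exception(condition)
except NameError as exc:
    print(f"real problem: {exc}")
```

Catching only the exception you expect, or at least printing the caught exception, would likely have pointed at the undefined helper right away.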

Needed:
from selenium.webdriver.support import expected_conditions as EC

From:
WebDriverWait(driver,30,.5).until(presence_of_element_located((By.ID,"login_screen")))

To:
WebDriverWait(driver,30,.5).until(EC.presence_of_element_located((By.ID,"login_screen")))
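Two other things in the question's loop may be worth changing: after the `except` block the code falls through and still calls the XPath lookup for a page that never loaded, and `writer.writerows(fw)` writes the last split result rather than the collected list. The following is a rough sketch of one way to restructure it, not a tested replacement. The helper names (`parse_firmware`, `read_firmware`, `collect`, `write_firmware_csv`), the `"Unknown"` fallback, and the two-column CSV layout are assumptions for illustration; the locator, XPath, prefix, and suffix are taken from the question.

```python
import csv


def parse_firmware(text):
    """Return everything after the third space, like the question's split(' ', 3)."""
    parts = text.split(" ", 3)
    # Assumption: fall back to "Unknown" when the text has fewer than four parts.
    return parts[3] if len(parts) > 3 else "Unknown"


def read_firmware(driver, url, timeout=30):
    """Load url and return the firmware text, or 'No connection' on failure."""
    # Imported lazily so the pure helpers in this sketch work without Selenium.
    from selenium.common.exceptions import WebDriverException
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    try:
        driver.get(url)
        WebDriverWait(driver, timeout, 0.5).until(
            EC.presence_of_element_located((By.ID, "login_screen"))
        )
    except WebDriverException:
        # TimeoutException is a subclass of WebDriverException, so this covers
        # both an unreachable host and a wait that runs out of time.
        return "No connection"
    text = driver.find_element(By.XPATH, "/html/body/div[2]/div/div[2]/div[4]").text
    return parse_firmware(text)


def collect(driver, ips, prefix="http://", suffix=":9191"):
    """Return one (ip, firmware) pair per IP, reusing a single driver."""
    return [(ip, read_firmware(driver, prefix + ip + suffix)) for ip in ips]


def write_firmware_csv(path, rows):
    """Write a header and one (ip, firmware) pair per CSV row."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["ip", "firmware"])
        writer.writerows(rows)
```

A possible usage, assuming a working Chrome setup: create one driver with `webdriver.Chrome(options=chrome_options)`, call `rows = collect(driver, iplist)`, pass `rows` to `write_firmware_csv`, and call `driver.quit()` once at the end. Reusing one driver avoids starting a new browser for each of the roughly 600 IPs.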
