How to automate login credentials with the Selenium Chrome WebDriver

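Problem description

I am trying to extract data from [this][1] website:

The manual procedure is to type a string such as "CCOCCO" into the search box, click "Predict Properties", and record the "Glass Transition Temperature (K)" from the table.

The following code automates the task above as long as the number of HTML POSTs is less than 5: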

from selenium import webdriver
from selenium.webdriver.chrome.options import Options 
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC

options=Options()
options.add_argument('start-maximized')
options.add_argument('disable-infobars')
options.add_argument('--disable-extensions')
driver=webdriver.Chrome(chrome_options=options)

def get_glass_temperature(smiles):
    driver.get('https://www.polymergenome.org/explore/index.php?m=1')
    x_path_click="//input[@class='large_input_no_round ui-autocomplete-input' and @id='keyword_original']"
    x_path_find="//input[@class='dark_blue_button_no_round' and @value='Predict Properties']"
    x_path_get="//table[@class='record']//tbody/tr[@class='record']//following::td[7]/center/font/font"
    WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, x_path_click))).send_keys(smiles)
    driver.find_element_by_xpath(x_path_find).click()
    return WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH,x_path_get))).get_attribute("innerHTML")

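I apply the function above to a pandas DataFrame with up to 400 strings similar to "CCOCCO". However, after 5 "Glass Temperature" values have been returned, a WebDriverException is raised because the website shows the following message: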

"Visits of more than 5 times per day to the property prediction capability requires login. "

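Before running the code I logged in to the website and ticked the "Remember me" box, but the error is the same.

I tried to modify the code as follows: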

from selenium import webdriver
from selenium.webdriver.chrome.options import Options 
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
import pandas as pd 
import os 

options=Options()
options.add_argument('start-maximized')
options.add_argument('disable-infobars')
options.add_argument('--disable-extensions')
driver=webdriver.Chrome(chrome_options=options, executable_path='/Users/ae/Downloads/chromedriver')

def get_glass_temperature(smiles):
    driver.get('https://www.polymergenome.org/explore/index.php?m=1')
    user_name='my_user_name'
    password='my_password'
    x_path_id="//input[@class='large_input_no_round' and @placeholder='User ID']"
    x_path_pass="//input[@class='large_input_no_round' and @placeholder='Password']"
    x_path_sign="//input[@class='orange_button_no_round' and @value='Sign In']"
    WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, x_path_id))).send_keys(user_name)
    WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, x_path_pass))).send_keys(password)
    driver.find_element_by_xpath(x_path_sign).click()

    x_path_click="//input[@class='large_input_no_round ui-autocomplete-input' and @id='keyword_original']"
    x_path_find="//input[@class='dark_blue_button_no_round' and @value='Predict Properties']"
    x_path_get="//table[@class='record']//tbody/tr[@class='record']//following::td[7]/center/font/font"
    WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, x_path_click))).send_keys(smiles)
    driver.find_element_by_xpath(x_path_find).click()
    return WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH,x_path_get))).get_attribute("innerHTML")

test_smiles=['CC(F)(F)CC(F)(F)','CCCCC(=O)OCCOC(=O)','CNS-C6H3-CSN-C6H3','CCOCCO','NH-CS-NH-C6H4','C4H8','C([*])C([*])(COOc1cc(Cl)ccc1)']
test_polymer=pd.DataFrame({'SMILES': test_smiles})
test_polymer['test_tg']=test_polymer['SMILES'].apply(get_glass_temperature)
print (test_polymer)

After the modification, I get a timeout error:

Traceback (most recent call last):
  File "/Users/alieftekhari/Desktop/extract_TG.py", line 42, in <module>
    test_polymer['test_tg']=test_polymer['SMILES'].apply(get_glass_temperature)
  File "/anaconda/lib/python2.7/site-packages/pandas/core/series.py", line 3194, in apply
    mapped = lib.map_infer(values, f, convert=convert_dtype)
  File "pandas/_libs/src/inference.pyx", line 1472, in pandas._libs.lib.map_infer
  File "/Users/user/Desktop/extract_TG.py", line 22, in get_glass_temperature
    WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, x_path_id))).send_keys(user_name)
  File "/anaconda/lib/python2.7/site-packages/selenium/webdriver/support/wait.py", line 80, in until
    raise TimeoutException(message, screen, stacktrace)
selenium.common.exceptions.TimeoutException: Message:
  [1]: https://www.polymergenome.org/explore/index.php?m=1

Tags: python, selenium, selenium-webdriver, selenium-chromedriver

Solution

Look at the last lines of the stack trace:

  File "/anaconda/lib/python2.7/site-packages/selenium/webdriver/support/wait.py", line 80, in until
    raise TimeoutException(message, screen, stacktrace)
selenium.common.exceptions.TimeoutException: Message:

The TimeoutException means that no element matching the locator became clickable within the 20-second wait. From what I can see, your XPath is wrong:

x_path_id="//input[@class='large_input_no_round ui-autocomplete-input' and @placeholder='User ID']"
x_path_pass="//input[@class='large_input_no_round ui-autocomplete-input' and @placeholder='Password']"

The login fields do not have the class large_input_no_round ui-autocomplete-input, so change the XPath to use the correct class, as follows:

x_path_id="//input[@class='large_input_no_round' and @placeholder='User ID']"
x_path_pass="//input[@class='large_input_no_round' and @placeholder='Password']"
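The mismatch above comes from comparing the whole class attribute as a single string, which breaks as soon as a script on the page adds or removes a class. As a minimal sketch, assuming you would rather match a single class token, the hypothetical helper below (xpath_with_class is not part of Selenium) builds such an XPath with the standard XPath 1.0 functions contains, concat and normalize-space:

```python
def xpath_with_class(tag, css_class, **attrs):
    """Build an XPath that matches one class token, not the full class string.

    Hypothetical helper for illustration. Attribute values that contain a
    single quote are not escaped here.
    """
    conditions = [
        "contains(concat(' ', normalize-space(@class), ' '), ' {} ')".format(css_class)
    ]
    # Sort the extra attributes so the generated XPath is deterministic.
    for name, value in sorted(attrs.items()):
        conditions.append("@{}='{}'".format(name, value))
    return "//{}[{}]".format(tag, " and ".join(conditions))


x_path_id = xpath_with_class("input", "large_input_no_round", placeholder="User ID")
x_path_pass = xpath_with_class("input", "large_input_no_round", placeholder="Password")
print(x_path_id)
```

An XPath built this way would match the field whether or not ui-autocomplete-input is also present in its class attribute.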

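Problem

  • driver.get('https://www.polymergenome.org/explore/index.php?m=1') loads a page that has no login window, which is why WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, x_path_id))).send_keys(user_name) raises a TimeoutException.

    In other words, when you run the script it launches a new browser instance, so your earlier login is gone and you need to log in again to get past the restriction "Visits of more than 5 times per day to the property prediction capability requires login." The login window only appears after 5 successful extraction iterations. The script fails because it tries to log in straight away, before the login dialog exists, and since there is no login window yet it gets a TimeoutException.

The solution is to put the data extraction in a try block and the login in the catch block, so that the login only runs when the extraction raises an exception. My Java implementation looks like this: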

@Test(invocationCount = 7)
    public void getList(){
        wait = new WebDriverWait(driver, 20);
        By locator = By.xpath("//table[@class='record']//tbody/tr[@class='record']//following::td[7]/center/font/font");
        try {
            driver.findElement(By.xpath("//input[@class='large_input_no_round ui-autocomplete-input' and @id='keyword_original']")).clear();
            driver.findElement(By.xpath("//input[@class='large_input_no_round ui-autocomplete-input' and @id='keyword_original']")).sendKeys("CCOCCO");
            driver.findElement(By.xpath("//input[@class='dark_blue_button_no_round' and @value='Predict Properties']")).click();
            String text = wait.until(ExpectedConditions.visibilityOfElementLocated(locator)).getAttribute("innerHTML");
            System.out.println(text);
        }catch(Exception e){
            System.out.println("In Exception Block");
            wait.until(ExpectedConditions.elementToBeClickable(By.xpath("//input[@class='large_input_no_round' and @placeholder='User ID']")));
            driver.findElement(By.xpath("//input[@class='large_input_no_round' and @placeholder='User ID']")).sendKeys("testing");
            driver.findElement(By.xpath("//input[@class='large_input_no_round' and @placeholder='Password']")).sendKeys("testing");
            driver.findElement(By.xpath("//input[@class='orange_button_no_round' and @value='Sign In']")).click();
        }
    }       
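Since the question is in Python, the same try/catch idea might look roughly like the sketch below. It is not a tested port of the Java code: fetch_with_login_fallback and make_selenium_callables are hypothetical names, the XPaths are copied from the question, and it assumes the login dialog is present on the page once the extraction has failed. Unlike the Java version, it retries the same input after logging in.

```python
URL = "https://www.polymergenome.org/explore/index.php?m=1"
X_PATH_CLICK = "//input[@class='large_input_no_round ui-autocomplete-input' and @id='keyword_original']"
X_PATH_FIND = "//input[@class='dark_blue_button_no_round' and @value='Predict Properties']"
X_PATH_GET = "//table[@class='record']//tbody/tr[@class='record']//following::td[7]/center/font/font"
X_PATH_ID = "//input[@class='large_input_no_round' and @placeholder='User ID']"
X_PATH_PASS = "//input[@class='large_input_no_round' and @placeholder='Password']"
X_PATH_SIGN = "//input[@class='orange_button_no_round' and @value='Sign In']"


def fetch_with_login_fallback(extract, login, smiles, recoverable=(Exception,)):
    """Try the extraction; if it fails, log in once and retry the same input."""
    try:
        return extract(smiles)
    except recoverable:
        login()
        return extract(smiles)


def make_selenium_callables(driver, user_name, password, timeout=20):
    """Wrap the Selenium steps from the question as (extract, login)."""
    # Imported here so the retry logic above can be used without Selenium.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    wait = WebDriverWait(driver, timeout)

    def extract(smiles):
        driver.get(URL)
        box = wait.until(EC.element_to_be_clickable((By.XPATH, X_PATH_CLICK)))
        box.clear()
        box.send_keys(smiles)
        driver.find_element(By.XPATH, X_PATH_FIND).click()
        cell = wait.until(EC.visibility_of_element_located((By.XPATH, X_PATH_GET)))
        return cell.get_attribute("innerHTML")

    def login():
        # Assumption: the login dialog is visible on the current page.
        wait.until(EC.element_to_be_clickable((By.XPATH, X_PATH_ID))).send_keys(user_name)
        driver.find_element(By.XPATH, X_PATH_PASS).send_keys(password)
        driver.find_element(By.XPATH, X_PATH_SIGN).click()

    return extract, login
```

With a real driver, one would call extract, login = make_selenium_callables(driver, 'my_user_name', 'my_password') and then apply lambda s: fetch_with_login_fallback(extract, login, s) to the SMILES column. Passing recoverable=(TimeoutException,) instead of the broad default would keep unrelated errors from triggering a login attempt.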

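Other approaches

  • The best approach is to browse the website, navigate to the login dialog and log in, then, once the login has succeeded, go to the search page and carry on extracting.
  • Or you can set a limit of 5 before logging in (that is, extract 5 times, then log in).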
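The second option could be sketched as a small counter around the extraction. This is only an illustration: LoginAfterLimit is a hypothetical class, the extract and login arguments stand for callables that wrap the Selenium steps from the question, and the demo functions are stand-ins that never open a browser. It also assumes the daily quota has not been touched before the script starts.

```python
class LoginAfterLimit(object):
    """Call login() once, just before the call that would exceed the free limit."""

    def __init__(self, extract, login, free_limit=5):
        self.extract = extract
        self.login = login
        self.free_limit = free_limit
        self.calls = 0
        self.logged_in = False

    def __call__(self, smiles):
        # After free_limit anonymous extractions, log in before continuing.
        if not self.logged_in and self.calls >= self.free_limit:
            self.login()
            self.logged_in = True
        self.calls += 1
        return self.extract(smiles)


# Stand-in callables so the sketch runs without Selenium.
events = []


def demo_extract(smiles):
    events.append("extract " + smiles)
    return "Tg for " + smiles


def demo_login():
    events.append("login")


fetch = LoginAfterLimit(demo_extract, demo_login, free_limit=5)
results = [fetch(s) for s in ["CCOCCO"] * 7]
```

An instance can be passed straight to pandas, for example test_polymer['SMILES'].apply(fetch), because it is callable like the original get_glass_temperature.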
