Can't make the script run periodically

Problem description

I'm trying to modify the following script so that it runs periodically. I know how to do the same thing with requests. However, with Selenium, I'm stuck.

I've tried:

import time
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

link = 'https://stackoverflow.com/questions/tagged/web-scraping'

def get_content(link):
    driver.get(link)
    for item in WebDriverWait(driver,10).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR,".question-summary"))):
        title = item.find_element_by_css_selector(".question-hyperlink").text
        link = item.find_element_by_css_selector(".question-hyperlink").get_attribute("href")
        print(title,link)
    driver.quit()

if __name__ == '__main__':
    driver = webdriver.Chrome()
    while True:
        get_content(link)
        time.sleep(20)

How can I make the script run periodically?

If I run it as is, I get the following error on the second attempt:

    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='127.0.0.1', port=51356): Max retries exceeded with url: /session/41bae2407c029ad2879619c3e65552da/url (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x02504850>: Failed to establish a new connection: [WinError 10061] No connection could be made because the target machine actively refused it'))

Tags: python, python-3.x, selenium, selenium-webdriver, web-scraping

Solution


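The function get_content calls driver.quit() as its last step.

That closes the browser and ends the WebDriver session after the first run, so the second driver.get(link) has nothing left to connect to, which is what the MaxRetryError reports. This means that on the next run, you need a new WebDriver instance.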

def get_content(link, driver):
    driver.get(link)
    ...  # scraping loop from the question, unchanged
    driver.quit()

if __name__ == '__main__':
    while True:
        driver = webdriver.Chrome()
        get_content(link, driver)
        time.sleep(20)
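
A variation on the same idea, offered only as a sketch: move driver.quit() out of the scraping function and into the loop, inside try/finally, so each run's browser is closed even if the scrape raises. The names run_periodically, scrape_questions, make_driver, max_runs and sleep are hypothetical and not part of the original answer. The CSS selectors are copied from the question and are assumed to still match the page.

```python
import time


def run_periodically(job, make_driver, interval=20, max_runs=None, sleep=time.sleep):
    """Call job(driver) with a fresh driver on every run, pausing between runs.

    All parameter names here are hypothetical. max_runs=None loops forever,
    like the `while True` loop in the answer.
    """
    runs = 0
    while max_runs is None or runs < max_runs:
        driver = make_driver()  # new WebDriver session for this run
        try:
            job(driver)
        finally:
            driver.quit()  # always close the browser, even if job raised
        runs += 1
        if max_runs is None or runs < max_runs:
            sleep(interval)  # wait only if another run follows
    return runs


def scrape_questions(driver, link='https://stackoverflow.com/questions/tagged/web-scraping'):
    """Same scraping step as get_content, but without driver.quit()."""
    # Imported lazily so run_periodically is usable without Selenium installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC

    driver.get(link)
    items = WebDriverWait(driver, 10).until(
        EC.visibility_of_all_elements_located((By.CSS_SELECTOR, ".question-summary")))
    for item in items:
        anchor = item.find_element(By.CSS_SELECTOR, ".question-hyperlink")
        print(anchor.text, anchor.get_attribute("href"))
```

With Selenium and a Chrome driver available, this would be started with `from selenium import webdriver` followed by `run_periodically(scrape_questions, webdriver.Chrome)`. Passing the driver factory in, rather than calling webdriver.Chrome() inside the loop, also makes it possible to try the loop with a stand-in driver.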
