Rotating proxies with Selenium/Python

Problem description

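I found this code on GitHub that collects IPs from https://free-proxy-list.net/ and rotates through them, but when I try to run it I get an error message.

I'm using ChromeDriver 2.41 because at first I got a different error about a Socks integer. Switching to ChromeDriver 2.41 fixed that, but I still can't get past this 'pxy' reference.

Also, PyCharm warns me that 'pd' has a redeclared definition without usage. I'd really appreciate help with the 'pxy' and 'pd' errors!

Here is the code: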

from selenium import webdriver
from selenium.webdriver.chrome.options import DesiredCapabilities
from selenium.webdriver.common.proxy import Proxy, ProxyType

import time


co = webdriver.ChromeOptions()
co.add_argument("log-level=3")
co.add_argument("--headless")

def get_proxies(co=co):
    driver = webdriver.Chrome(chrome_options=co)
    driver.get("https://free-proxy-list.net/")

    PROXIES = []
    proxies = driver.find_elements_by_css_selector("tr[role='row']")
    for p in proxies:
        result = p.text.split(" ")

        if result[-1] == "yes":
            PROXIES.append(result[0]+":"+result[1])

    driver.close()
    return PROXIES


ALL_PROXIES = get_proxies()


def proxy_driver(PROXIES, co=co):
    prox = Proxy()

    if PROXIES:
        pxy = PROXIES[-1]
    else:
        print("--- Proxies used up (%s)" % len(PROXIES))
        PROXIES = get_proxies()

    prox.proxy_type = ProxyType.MANUAL
    prox.http_proxy = pxy
    prox.socks_proxy = pxy
    prox.ssl_proxy = pxy

    capabilities = webdriver.DesiredCapabilities.CHROME
    prox.add_to_capabilities(capabilities)

    driver = webdriver.Chrome(chrome_options=co, desired_capabilities=capabilities)

    return driver



# --- YOU ONLY NEED TO CARE FROM THIS LINE ---
# creating new driver to use proxy
pd = proxy_driver(ALL_PROXIES)

# code must be in a while loop with a try to keep trying with different proxies
running = True

while running:
    try:
        mycodehere()

        # if statement to terminate loop if code working properly
        something()

        # you 
    except:
        new = ALL_PROXIES.pop()

        # reassign driver if fail to switch proxy
        pd = proxy_driver(ALL_PROXIES)
        print("--- Switched proxy to: %s" % new)

This is the error I get:

Traceback (most recent call last):
  File "scripts2.py", line 65, in <module>
    pd = proxy_driver(ALL_PROXIES)
  File "scripts2.py", line 53, in proxy_driver
    prox.http_proxy = pxy
UnboundLocalError: local variable 'pxy' referenced before assignment

I'm a bit confused, because I thought 'pxy' was assigned under if PROXIES?

Tags: python, selenium, selenium-webdriver, proxy

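Solution


pxy is only assigned inside the if branch. When PROXIES is empty, the else branch runs instead, pxy is never bound, and the later prox.http_proxy = pxy raises UnboundLocalError. Try changing the following lines.

Current code: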

if PROXIES:
    pxy = PROXIES[-1]  # script will fail if this condition not met
else:
    print("--- Proxies used up (%s)" % len(PROXIES))
    PROXIES = get_proxies()

Updated code:

# refill first; "not PROXIES" also catches an empty list, which "is None" misses
if not PROXIES:
    print("--- Proxies used up, fetching a new list")
    PROXIES = get_proxies()
# get_proxies() may still come back empty, so check before indexing
if not PROXIES:
    raise RuntimeError("get_proxies() returned no proxies")
# pxy is now assigned on every path that reaches the code below
pxy = PROXIES[-1]
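
To try the idea without launching a browser, here is a minimal sketch of the same selection logic as a standalone function. The names pick_proxy and fake_get_proxies are hypothetical and not part of the original script; fake_get_proxies stands in for get_proxies() and returns placeholder addresses. Unlike the snippet above, it refills the list in place with extend(), on the assumption that the caller's list (ALL_PROXIES in the question) should see the new entries too.

```python
from typing import Callable, List


def pick_proxy(proxies: List[str], fetch: Callable[[], List[str]]) -> str:
    """Return the last proxy in `proxies`, refilling the list when it is empty."""
    if not proxies:
        # Refill in place so the caller's list is updated as well.
        proxies.extend(fetch())
    if not proxies:
        # The fetch returned nothing; fail with a clear message instead of an IndexError.
        raise RuntimeError("no proxies available")
    # Reached on every path, so the value is always bound before it is used.
    return proxies[-1]


def fake_get_proxies() -> List[str]:
    # Hypothetical stand-in for get_proxies(); the addresses are placeholders.
    return ["203.0.113.10:8080", "203.0.113.11:3128"]


pool: List[str] = []
print(pick_proxy(pool, fake_get_proxies))
print(pool)
```

In proxy_driver this would replace the if/else block with a single line, pxy = pick_proxy(PROXIES, get_proxies), and the rest of the function stays unchanged.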
