Web page refuses connection

Problem description

Hi everyone, I am trying to do some web scraping with BeautifulSoup, and I am getting this error:

ConnectionRefusedError    Traceback (most recent call last)
urllib.error.URLError: <urlopen error [Errno 10061] No connection could be made
because the target machine actively refused it>

Here is my code:

from urllib.request import urlopen
from bs4 import BeautifulSoup
import base64
import pytesseract as pyt
import requests
from PIL import Image
import matplotlib.pyplot as ptl
import numpy as np

pyt.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'


login_url = 'http://www.root-me.org/?page=login&lang=fr'
payload = {
    'var_login': 'email',
    'password': 'pass',
}

with requests.Session() as s:
    response = requests.post(login_url, payload)
    scrap_url = urlopen('http://challenge01.root-me.org/programmation/ch8/')
    soup = BeautifulSoup(scrap_url)
    img = soup.find('img')['src'].split(',')[1]
    with open('captcha.png', 'wb') as guardar:
        decodificar = base64.b64decode(img)
        guardar.write(decodificar)
    

leer_img = Image.open('captcha.png', 'r')
ptl.imshow(np.asarray(leer_img))
texto_captcha = pyt.image_to_string(leer_img)
print(texto_captcha)

The problem is that when I log in to this page I get the captcha, but after logging out I receive the error described above. Any suggestions?
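
A note on the error itself: Errno 10061 is the Windows socket code WSAECONNREFUSED, which is raised when the TCP connection is rejected before any HTTP request is sent. One way to narrow it down is to check whether the host accepts connections at all. The following is a minimal sketch using only the standard library; the helper name can_connect and its default port and timeout are assumptions made for illustration and do not come from the original post.

```python
import socket


def can_connect(host: str, port: int = 80, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        # create_connection raises OSError (ConnectionRefusedError is a
        # subclass) when the target refuses the connection, cannot be
        # resolved, or does not answer within the timeout.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, if can_connect('challenge01.root-me.org') returns False, that would point to a network-level cause (server down, firewall, proxy, or the server rejecting the client) rather than to the parsing code.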

Tags: python

Solution
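
This is not a confirmed fix, but two things in the posted code are worth changing. First, the login is sent with requests.post rather than s.post, and the challenge page is fetched with urlopen, so any cookies set by the login never reach the second request. Second, a refused connection can be transient, so it may help to retry a few times with a pause in between. The sketch below assumes both points apply; the function name login_and_fetch and the parameters retry_on, attempts, and delay are hypothetical, and session is expected to behave like a requests.Session.

```python
import time


def login_and_fetch(session, login_url, payload, page_url,
                    retry_on=(ConnectionError,), attempts=3, delay=2.0):
    """Log in and fetch page_url through the same session.

    `session` is assumed to offer post() and get() like requests.Session,
    returning objects with raise_for_status() and a .text attribute.
    Exceptions listed in `retry_on` trigger another attempt.
    """
    # Post through the session so that cookies set by the login
    # are sent again with the GET that follows.
    session.post(login_url, data=payload, timeout=10).raise_for_status()

    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            response = session.get(page_url, timeout=10)
            response.raise_for_status()
            return response.text
        except retry_on as exc:
            last_error = exc
            if attempt < attempts:
                # Wait a little longer after each failed attempt.
                time.sleep(delay * attempt)
    # Every attempt failed: re-raise the last connection error.
    raise last_error
```

With requests, this could be called inside the existing with requests.Session() as s: block as login_and_fetch(s, login_url, payload, 'http://challenge01.root-me.org/programmation/ch8/', retry_on=(requests.exceptions.ConnectionError,)), and the returned HTML passed to BeautifulSoup(html, 'html.parser'). The explicit retry_on is needed because requests wraps connection failures in its own ConnectionError class, which does not derive from the built-in one. If the connection is still refused after several attempts, the cause is more likely on the network or server side than in the script.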

