URLError when running Python code to fetch URLs on Ubuntu

Problem description

Below is the stack trace from the Ubuntu terminal. Even Anaconda takes far too long to open (about 20 minutes).

  Traceback (most recent call last):
  File "/home/narendra/anaconda3/lib/python3.7/urllib/request.py", line 1317, in do_open
    encode_chunked=req.has_header('Transfer-encoding'))
  File "/home/narendra/anaconda3/lib/python3.7/http/client.py", line 1244, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/home/narendra/anaconda3/lib/python3.7/http/client.py", line 1290, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/home/narendra/anaconda3/lib/python3.7/http/client.py", line 1239, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/home/narendra/anaconda3/lib/python3.7/http/client.py", line 1026, in _send_output
    self.send(msg)
  File "/home/narendra/anaconda3/lib/python3.7/http/client.py", line 966, in send
    self.connect()
  File "/home/narendra/anaconda3/lib/python3.7/http/client.py", line 1406, in connect
    super().connect()
  File "/home/narendra/anaconda3/lib/python3.7/http/client.py", line 938, in connect
    (self.host,self.port), self.timeout, self.source_address)
  File "/home/narendra/anaconda3/lib/python3.7/socket.py", line 727, in create_connection
    raise err
  File "/home/narendra/anaconda3/lib/python3.7/socket.py", line 716, in create_connection
    sock.connect(sa)
TimeoutError: [Errno 110] Connection timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "TASK_1.py", line 23, in <module>
    response = urllib.request.urlopen(line,context=gcontext)
  File "/home/narendra/anaconda3/lib/python3.7/urllib/request.py", line 222, in urlopen
    return opener.open(url, data, timeout)
  File "/home/narendra/anaconda3/lib/python3.7/urllib/request.py", line 525, in open
    response = self._open(req, data)
  File "/home/narendra/anaconda3/lib/python3.7/urllib/request.py", line 543, in _open
    '_open', req)
  File "/home/narendra/anaconda3/lib/python3.7/urllib/request.py", line 503, in _call_chain
    result = func(*args)
  File "/home/narendra/anaconda3/lib/python3.7/urllib/request.py", line 1360, in https_open
    context=self._context, check_hostname=self._check_hostname)
  File "/home/narendra/anaconda3/lib/python3.7/urllib/request.py", line 1319, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error [Errno 110] Connection timed out>
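
The two tracebacks above describe one failure, not two: the socket-level `TimeoutError` is caught inside `do_open` and re-raised wrapped in a `URLError`. As a minimal sketch (the helper name `underlying_reason` is made up for illustration), the wrapped error can be rebuilt by hand and inspected through the `reason` attribute, with no network access needed:

```python
import urllib.error


def underlying_reason(err):
    """Return the object wrapped by a URLError (often an OSError)."""
    return err.reason


# Rebuild the error from the traceback by hand; nothing is fetched here.
wrapped = urllib.error.URLError(TimeoutError(110, "Connection timed out"))
reason = underlying_reason(wrapped)

print(wrapped)                              # same text as the last traceback line
print(type(reason).__name__, reason.errno)  # TimeoutError 110
```

In a real `except urllib.error.URLError as err:` handler, `err.reason` can be checked the same way to tell a timeout apart from, say, a DNS failure.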

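Below is my code.

This code saves the contents of each URL to a file.

It picks the URLs from url.txt one by one.

Then it extracts all the page data from that particular URL.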

import io
import ssl
import urllib.error
import urllib.parse
import urllib.request

# localhost, 127.0.0.0/8, ::1, 10.0.0.0/8
# Read url.txt line by line with readline().
file1 = open("url.txt", "r")
count = 0
gcontext = ssl.SSLContext()

for i in range(18):
    count += 1
    # Get the next line from the file.
    line = file1.readline()
    # An empty string means the end of the file was reached.
    if not line:
        break
    # Strip the trailing newline before opening the URL.
    response = urllib.request.urlopen(line.strip(), context=gcontext)
    webContent = response.read()
    # response.read() returns bytes, so decode before writing as text.
    with io.open("file_" + str(i) + ".txt", 'w', encoding='utf-8') as f:
        f.write(webContent.decode('utf-8', errors='replace'))
file1.close()

Tags: python, ubuntu, url, anaconda

Solution
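
`[Errno 110] Connection timed out` means the TCP connection was never established, so the cause is more likely the network (an unreachable host, a firewall, or a proxy that Python is not using) than the script itself. The commented-out address list in the question looks like a no-proxy setting, which would suggest a proxy is involved, though that is only a guess; `urllib.request.getproxies()` shows which proxy settings Python actually sees. Whatever the cause, the loop can be made more tolerant so that one unreachable URL does not abort the whole run. The sketch below is one possible shape, not a definitive fix: the names `fetch` and `save_all`, the `opener` parameter, and the 10-second timeout are all assumptions introduced here. Unlike the question's code, it uses the default SSL context, so certificates are verified.

```python
import socket
import urllib.error
import urllib.request


def fetch(url, timeout=10, opener=urllib.request.urlopen):
    """Return the page body as bytes, or None if the request fails.

    `opener` is injectable only so the function can be exercised offline.
    """
    url = url.strip()  # drop the trailing newline left by file iteration
    try:
        with opener(url, timeout=timeout) as response:
            return response.read()
    except (urllib.error.URLError, socket.timeout) as err:
        # URLError covers connection failures and HTTP errors;
        # socket.timeout can surface while reading a slow response.
        print(f"skipping {url!r}: {err}")
        return None


def save_all(url_file="url.txt", timeout=10):
    """Fetch every non-empty line of url_file and save it as file_<i>.txt."""
    with open(url_file, "r", encoding="utf-8") as urls:
        for i, line in enumerate(urls):
            if not line.strip():
                continue
            body = fetch(line, timeout=timeout)
            if body is not None:
                # Write the raw bytes; no decoding step is needed.
                with open(f"file_{i}.txt", "wb") as out:
                    out.write(body)
```

Calling `save_all()` in the directory that contains url.txt would then write one file per reachable URL and print a short message for each one that fails. If every URL times out, the problem is almost certainly outside Python, and checking the same URLs with a browser or `curl` on the same machine is a reasonable next step.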

