Error downloading csv files from GitHub (urllib.request)

Problem description

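I worked on this script for a few weeks. I had just finished it and everything was running fine. When I came back to the script and tried to run it, I got an error, and I don't know what it means or how to fix it. The error occurs early in the script, when I download some csv files containing CoronaVirus data. I am using urllib.request to do this, but I get this error: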

  File "Run.py", line 13, in <module>
    urllib.request.urlretrieve(url, filename="time_series_covid19_recoveredGlobal.csv")
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/urllib/request.py", line 247, in urlretrieve
    with contextlib.closing(urlopen(url, data)) as fp:
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/urllib/request.py", line 222, in urlopen
    return opener.open(url, data, timeout)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/urllib/request.py", line 525, in open
    response = self._open(req, data)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/urllib/request.py", line 542, in _open
    result = self._call_chain(self.handle_open, protocol, protocol +
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/urllib/request.py", line 502, in _call_chain
    result = func(*args)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/urllib/request.py", line 1362, in https_open
    return self.do_open(http.client.HTTPSConnection, req,
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/urllib/request.py", line 1322, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error [Errno 8] nodename nor servname provided, or not known>

And another one:

Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/urllib/request.py", line 1319, in do_open
    h.request(req.get_method(), req.selector, req.data, headers,
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/http/client.py", line 1230, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/http/client.py", line 1276, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/http/client.py", line 1225, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/http/client.py", line 1004, in _send_output
    self.send(msg)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/http/client.py", line 944, in send
    self.connect()
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/http/client.py", line 1392, in connect
    super().connect()
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/http/client.py", line 915, in connect
    self.sock = self._create_connection(
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/socket.py", line 787, in create_connection
    for res in getaddrinfo(host, port, 0, SOCK_STREAM):
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/socket.py", line 918, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno 8] nodename nor servname provided, or not known

I would like to know why this happens. Here is what I have:

import urllib.request
url = "https://raw.githubuserxxxent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_recovered_global.csv"
urllib.request.urlretrieve(url, filename="time_series_covid19_recoveredGlobal.csv")
url = "https://raw.githubuserxxxent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_global.csv"
urllib.request.urlretrieve(url, filename="time_series_covid19_confirmed_global.csv")
url = "https://raw.githubuserxxxent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_deaths_global.csv"
urllib.request.urlretrieve(url, filename="time_series_covid19_deaths_global.csv")

Thank you!

Tags: python, csv, urllib

Solution


Your URLs contain a typo:

The domain in them is githubuserxxxent.com, but it should be githubusercontent.com.

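Note that githubuserxxxent.com returns an "Address not found" error when you try to open it in a browser. That matches the traceback: socket.gaierror [Errno 8] is raised by getaddrinfo when a hostname lookup fails.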
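
If you want to confirm this from Python rather than a browser, a rough sketch like the one below may help. It pulls the hostname out of a URL and asks DNS whether it resolves. The helper names hostname_of and resolves are made up for this illustration and are not part of urllib or socket.

```python
import socket
from urllib.parse import urlsplit


def hostname_of(url):
    """Return the host part of a URL, e.g. 'raw.githubusercontent.com'."""
    return urlsplit(url).hostname


def resolves(host, port=443):
    """Return True if DNS can resolve `host`, False if the lookup fails.

    Hypothetical helper for this answer: it only checks name resolution,
    not whether the server actually serves the file.
    """
    try:
        socket.getaddrinfo(host, port)
    except socket.gaierror:
        # The same exception type that ends the second traceback in the question
        return False
    return True
```

Calling resolves(hostname_of(url)) on each URL before downloading should return False for the misspelled domain and True for raw.githubusercontent.com, assuming your machine has working DNS.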

Perhaps you ran a find-and-replace across your code, and that is how this happened.

Fixed code:

import urllib.request
url = "https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_recovered_global.csv"
urllib.request.urlretrieve(url, filename="time_series_covid19_recoveredGlobal.csv")
url = "https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_global.csv"
urllib.request.urlretrieve(url, filename="time_series_covid19_confirmed_global.csv")
url = "https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_deaths_global.csv"
urllib.request.urlretrieve(url, filename="time_series_covid19_deaths_global.csv")
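
As an optional variation, the three downloads could share one base URL, so a typo like this only has to be fixed in one place, and each failure could be reported without a full traceback. This is only a sketch: BASE, FILES, build_urls and download_all are names invented here, and unlike the code above it saves every file under its remote name (so the recovered file becomes time_series_covid19_recovered_global.csv).

```python
import urllib.error
import urllib.request

# Assumed layout: all three files live under the same directory of the repo
BASE = ("https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/"
        "csse_covid_19_data/csse_covid_19_time_series/")
FILES = [
    "time_series_covid19_recovered_global.csv",
    "time_series_covid19_confirmed_global.csv",
    "time_series_covid19_deaths_global.csv",
]


def build_urls(base=BASE, names=FILES):
    """Pair each full URL with the local filename it will be saved as."""
    return [(base + name, name) for name in names]


def download_all(pairs, retrieve=urllib.request.urlretrieve):
    """Download every (url, filename) pair and return the URLs that failed."""
    failed = []
    for url, filename in pairs:
        try:
            retrieve(url, filename=filename)
        except urllib.error.URLError as exc:
            # A bad hostname, like the one in the question, ends up here
            print(f"Could not download {url}: {exc.reason}")
            failed.append(url)
    return failed
```

Running download_all(build_urls()) should fetch the three files into the current directory and return an empty list when every download succeeds.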
