How do I prevent this rate limiter error in Geopy?

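Problem description

I have a dataframe full of UK postcodes. It has about 400 rows, and I want to geocode these postcodes so that I can plot them later. I followed the guide below, so I am not sure what is causing the error: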

https://practicaldatascience.co.uk/data-science/how-to-geocode-and-map-addresses-in-geopy

My code is below. The dataframe I am using has a single column containing UK postcodes from a dummy dataset.

import pandas as pd
import folium
import geopy
from geopy.geocoders import Nominatim
from geopy.extra.rate_limiter import RateLimiter

# Wrap Nominatim's geocode call so that requests are spaced at least 1 second apart
geocoder = RateLimiter(Nominatim(user_agent='Get_Lat_Longs').geocode, min_delay_seconds=1)

# Load the sheet that holds the postcodes
df = pd.read_excel('Postcodes.xls', sheet_name='Addresses formatted')

df_copy = df.copy()

# Keep only the postcode column, then geocode each postcode in turn
df_postcodes = df_copy['Postcode'].to_frame()
df_postcodes['Geocode'] = df_postcodes['Postcode'].apply(geocoder)

However, I get the following error and I am not sure how to debug what I have done. Any help would be greatly appreciated.

RateLimiter caught an error, retrying (0/2 tries). Called with (*('N20 0PE',), **{}).
Traceback (most recent call last):
  File "c:\users\np\env\lib\site-packages\urllib3\connectionpool.py", line 696, in urlopen
    self._prepare_proxy(conn)
  File "c:\users\np\env\lib\site-packages\urllib3\connectionpool.py", line 964, in _prepare_proxy
    conn.connect()
  File "c:\users\np\env\lib\site-packages\urllib3\connection.py", line 364, in connect
    conn = self._connect_tls_proxy(hostname, conn)
  File "c:\users\np\env\lib\site-packages\urllib3\connection.py", line 507, in _connect_tls_proxy
    ssl_context=ssl_context,
  File "c:\users\np\env\lib\site-packages\urllib3\util\ssl_.py", line 453, in ssl_wrap_socket
    ssl_sock = _ssl_wrap_socket_impl(sock, context, tls_in_tls)
  File "c:\users\np\env\lib\site-packages\urllib3\util\ssl_.py", line 495, in _ssl_wrap_socket_impl
    return ssl_context.wrap_socket(sock)
  File "C:\Program Files\Python37\lib\ssl.py", line 423, in wrap_socket
    session=session
  File "C:\Program Files\Python37\lib\ssl.py", line 870, in _create
    self.do_handshake()
  File "C:\Program Files\Python37\lib\ssl.py", line 1139, in do_handshake
    self._sslobj.do_handshake()
socket.timeout: _ssl.c:1074: The handshake operation timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "c:\users\np\env\lib\site-packages\requests\adapters.py", line 449, in send
    timeout=timeout
  File "c:\users\np\env\lib\site-packages\urllib3\connectionpool.py", line 796, in urlopen
    **response_kw
  File "c:\users\np\env\lib\site-packages\urllib3\connectionpool.py", line 796, in urlopen
    **response_kw
  File "c:\users\np\env\lib\site-packages\urllib3\connectionpool.py", line 756, in urlopen
    method, url, error=e, _pool=self, _stacktrace=sys.exc_info()[2]
  File "c:\users\np\env\lib\site-packages\urllib3\util\retry.py", line 574, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='nominatim.openstreetmap.org', port=443): Max retries exceeded with url: /search?q=N20+0PE&format=json&limit=1 (Caused by ProxyError('Cannot connect to proxy.', timeout('_ssl.c:1074: The handshake operation timed out')))

Tags: python, pandas, geolocation, geopy

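Solution

The problem was that I was trying to do this inside a virtual machine. After checking the comments I received, I was able to determine that inside the virtual machine the requests were not being sent to the website. On my local machine this was not a problem, and I was able to get geocodes for everything.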
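The traceback is consistent with this: the failure is a ProxyError whose TLS handshake timed out, which points at the network environment rather than at the rate limiter. One way to check for this kind of problem is to look, using only the standard library, at which proxy settings Python picks up and whether a TCP connection can be opened at all. The sketch below is a hypothetical diagnostic and not part of geopy; the helper names `describe_proxies` and `can_connect` are made up for illustration.

```python
import socket
import urllib.request


def describe_proxies(proxies=None):
    """Return the HTTP/HTTPS proxy settings Python would pick up.

    If `proxies` is None, the settings are read from the environment
    (and, on some platforms, from the system configuration).
    """
    if proxies is None:
        proxies = urllib.request.getproxies()
    return {scheme: url for scheme, url in proxies.items()
            if scheme in ("http", "https")}


def can_connect(host, port=443, timeout=5.0, connect=socket.create_connection):
    """Return True if a TCP connection to (host, port) opens within `timeout`.

    `connect` is injectable (an assumption made for this sketch) so the
    function can be exercised without touching the network.
    """
    try:
        conn = connect((host, port), timeout=timeout)
    except OSError:
        # Covers timeouts, DNS failures and refused connections.
        return False
    conn.close()
    return True


# Example usage (performs real network access, so it is left commented out):
# print(describe_proxies())
# print(can_connect("nominatim.openstreetmap.org"))
```

Running both helpers on the virtual machine and on the local machine should show where the two environments differ. Behind a mandatory proxy a direct connection may fail even when the proxy itself works, so a `False` result is a hint about where to look, not proof.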

