HTTP CONNECT + GET returns the wrong status

Problem description

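I am sending GET requests through a proxy that supports only HTTP (not HTTPS). When I request an HTTPS URL through this proxy (or any other HTTP-only proxy) with curl, it returns 403, which seems to be the correct status. But if I send just CONNECT and GET by hand, I get 200.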

curl - 403 Forbidden:

curl -x proxyhost:proxyport -I https://example.com -vvv

*   Trying PROXYHOST:8080...
* TCP_NODELAY set
* Connected to PROXYHOST (PROXYHOST) port 8080 (#0)
* allocate connect buffer!
* Establish HTTP proxy tunnel to www.example.com:443
> CONNECT www.example.com:443 HTTP/1.1
> Host: www.example.com:443
> User-Agent: curl/7.68.0
> Proxy-Connection: Keep-Alive
> 
< HTTP/1.1 403 Forbidden
HTTP/1.1 403 Forbidden
< Date: Fri, 15 Oct 2021 15:37:31 GMT
Date: Fri, 15 Oct 2021 15:37:31 GMT
< Server: Apache
Server: Apache
< Content-Length: 202
Content-Length: 202
< Content-Type: text/html; charset=iso-8859-1
Content-Type: text/html; charset=iso-8859-1
< 

* Received HTTP code 403 from proxy after CONNECT
* CONNECT phase completed!
* Closing connection 0
curl: (56) Received HTTP code 403 from proxy after CONNECT
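
For comparison, the exchange in the curl log above can be reproduced by hand over a raw socket. The following is only a minimal sketch: the helper names (`build_connect_request`, `parse_status_line`, `connect_status`) are hypothetical, and it sends just the `CONNECT` line and `Host` header from the log, without curl's `User-Agent` or `Proxy-Connection` headers.

```python
import socket


def build_connect_request(host, port):
    """Build a CONNECT request head like the one curl sends in the log above."""
    target = f"{host}:{port}"
    return (
        f"CONNECT {target} HTTP/1.1\r\n"
        f"Host: {target}\r\n"
        "\r\n"  # the empty line ends the request head
    ).encode("ascii")


def parse_status_line(line):
    """Split a status line such as b'HTTP/1.1 403 Forbidden' into (code, reason)."""
    _version, _, rest = line.decode("iso-8859-1").strip().partition(" ")
    code, _, reason = rest.partition(" ")
    return int(code), reason


def connect_status(proxy_host, proxy_port, target_host, target_port=443, timeout=10.0):
    """Send CONNECT to the proxy and return the (code, reason) of its reply."""
    with socket.create_connection((proxy_host, proxy_port), timeout=timeout) as sock:
        sock.sendall(build_connect_request(target_host, target_port))
        line = sock.makefile("rb").readline()
        if not line:
            raise ConnectionError("proxy closed the connection without a response")
        return parse_status_line(line)
```

Calling `connect_status("PROXYHOST", 8080, "www.example.com")` with the placeholder host replaced by a real one would return the status the proxy gives for the tunnel request itself, which is the status curl reports as "Received HTTP code 403 from proxy after CONNECT".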

Plain HTTP - 200 OK:

CONNECT PROXYHOST:PROXYPORT HTTP/1.0
GET https://www.example.com:443 HTTP/1.0
    
HTTP/1.0 200 OK
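
To see how the hand-typed bytes above are framed on the wire, it can help to split them the way HTTP/1.x framing does: the first line is the request line, and every following line up to the first empty line belongs to the header section. This is a rough sketch, assuming the two lines were sent with CRLF line endings and terminated by the blank line shown; `split_message_head` is a hypothetical helper, and it says nothing about how PROXYHOST actually interprets the request.

```python
def split_message_head(raw):
    """Return (request_line, header_lines) for the head of a raw HTTP/1.x request."""
    # The head ends at the first empty line (CRLF CRLF).
    head, _, _body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    return lines[0], lines[1:]


# The bytes typed in the plain HTTP attempt (placeholders kept as written).
typed = (
    b"CONNECT PROXYHOST:PROXYPORT HTTP/1.0\r\n"
    b"GET https://www.example.com:443 HTTP/1.0\r\n"
    b"\r\n"
)

request_line, header_lines = split_message_head(typed)
print("request line:", request_line)
print("header section:", header_lines)
```

Under that assumption, the `GET` line sits where a header would be, and the `CONNECT` target is the proxy's own address rather than `www.example.com:443` as in the curl log.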

Why do I get 200 with CONNECT + GET?

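Additional information: PROXYHOST is just a random host with no proxy configuration or proxy software. It just so happens that when you use it as a proxy for an HTTP GET request, it returns the correct status (200 if the requested page exists, 404 if it does not, and so on), but with its own HTML instead of the requested body. Meanwhile, if you try to use it as a proxy to request HTTPS by any means other than CONNECT + GET, it always returns 403.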

I also tried Python requests and got this result (with verbose logging):

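import http.client
import requests

# One way to get send/reply logs like those below (assumed; the original does not show how logging was enabled)
http.client.HTTPConnection.debuglevel = 1

proxy = {'https': 'http://PROXYHOST:8080', 'http': 'http://PROXYHOST:8080'}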

requests.get('https://example.com', proxies=proxy)
# logs
send: b'CONNECT example.com:443 HTTP/1.0\r\n'
send: b'\r\n'
# exception
File "/usr/lib/python3.8/http/client.py", line 276, in _read_status
    raise RemoteDisconnected("Remote end closed connection without"
http.client.RemoteDisconnected: Remote end closed connection without response / (Caused by ProxyError('Cannot connect to proxy.', RemoteDisconnected('Remote end closed connection without response')))

requests.get('http://example.com', proxies=proxy)
# logs
send: b'GET http://example.com/ HTTP/1.1\r\nHost: example.com\r\nUser-Agent: python-requests/2.25.0\r\nAccept-Encoding: gzip, deflate\r\nAccept: */*\r\nConnection: keep-alive\r\n\r\n'
reply: 'HTTP/1.1 200 OK\r\n'
header: Date: Fri, 15 Oct 2021 17:10:00 GMT
header: ...

<Response [200]>

Tags: sockets, http, curl, proxy, python-requests

Solution

