Getting the public IP and other text-based information from http://ip.zscaler.com/

Problem description

With either of these two Python commands, I can easily get my public IP.

>>> from requests import get
>>> get('https://ident.me').text
'1.2.3.4'
>>>

>>> import urllib.request
>>> urllib.request.urlopen('https://ident.me').read().decode('utf8')
'1.2.3.4'
>>>

However, when I change the URL from https://ident.me to http://ip.zscaler.com/, I get a lot of unnecessary HTML.

I am only interested in the text-based information shown in the screenshots below.

[Screenshot: Test Proxy 1]

[Screenshot: Test Proxy 2]

[Screenshot: Test Proxy 3]

Is it possible to get only the important text-based information from http://ip.zscaler.com/ and drop the unnecessary HTML tags?

Desired output

>>> get('http://ip.zscaler.com/').text
The request received from you did not have an XFF header, so you are quite likely not going through the Zscaler proxy service.
Your request is arriving at this server from the IP address x.x.x.x
Your Gateway IP Address is most likely x.x.x.x
>>>

>>> urllib.request.urlopen('http://ip.zscaler.com/').read().decode('utf8')
The request received from you did not have an XFF header, so you are quite likely not going through the Zscaler proxy service.
Your request is arriving at this server from the IP address x.x.x.x
Your Gateway IP Address is most likely x.x.x.x
>>>

Tags: python, python-3.x

Solution


Use BeautifulSoup and requests:

from bs4 import BeautifulSoup
from requests import get

URL = "http://ip.zscaler.com/"

# Send a GET request and keep the response body (HTML) as text
html = get(URL, timeout=10).text

# Parse the HTML with Python's built-in parser
soup = BeautifulSoup(html, features="html.parser")

# Print out headline
headline = soup.find("div", attrs={"class": "headline"})
print(headline.text)

# Print out details
for detail in soup.find_all("div", attrs={"class": "details"}):
    print(detail.text)

This gives the following output:

The request received from you did not have an XFF header, so you are quite likely not going through the Zscaler proxy service.
Your request is arriving at this server from the IP address 119.17.136.170
Your Gateway IP Address is most likely 119.17.136.170
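
The question also uses urllib, so here is a rough sketch of the same idea with only the standard library, in case installing BeautifulSoup is not an option. It assumes the page still marks the relevant text with `div` elements whose class is `headline` or `details`, as the answer above does. The `DivTextExtractor` class, the `extract_text` helper, and the `SAMPLE` markup are made up for illustration and are not part of any library or of the real page.

```python
from html.parser import HTMLParser


class DivTextExtractor(HTMLParser):
    """Collect the text of every <div> whose class is in `wanted` (hypothetical helper)."""

    def __init__(self, wanted=("headline", "details")):
        super().__init__()
        self.wanted = set(wanted)
        self.depth = 0      # > 0 while we are inside a matching <div>
        self.parts = []     # text fragments of the <div> being read
        self.results = []   # finished text, one entry per matching <div>

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        classes = (dict(attrs).get("class") or "").split()
        # Start tracking on a matching <div>; count nested <div>s while tracking
        if self.depth or self.wanted.intersection(classes):
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "div" and self.depth:
            self.depth -= 1
            if self.depth == 0:
                # Collapse runs of whitespace into single spaces
                self.results.append(" ".join("".join(self.parts).split()))
                self.parts = []

    def handle_data(self, data):
        if self.depth:
            self.parts.append(data)


def extract_text(html):
    """Return the text of all headline/details <div>s found in `html`."""
    parser = DivTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.results


# Made-up sample markup, for illustration only (not the real page source)
SAMPLE = """
<html><body>
<div class="nav"><a href="/">Home</a></div>
<div class="headline">The request received from you did not have an XFF header.</div>
<div class="details">Your request is arriving at this server from the IP address <span>x.x.x.x</span></div>
</body></html>
"""

for line in extract_text(SAMPLE):
    print(line)
```

To try it against the live page, pass the decoded response to the helper, for example extract_text(urllib.request.urlopen('http://ip.zscaler.com/').read().decode('utf8')) after import urllib.request. If the site changes its markup, the class names passed as `wanted` would need to be adjusted.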
