Parsing the site myip.ms

Problem description

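I am writing a parser for the site https://myip.ms/. For the page https://myip.ms/browse/sites/1/ipID/23.227.38.0/ipIDii/23.227.38.255/own/376714 everything works. For the next page, https://myip.ms/browse/sites/2/ipID/23.227.38.0/ipIDii/23.227.38.255/own/376714, the script outputs no data, although the page structure is the same. I suspect the site either limits the number of page views or requires registration, but I could not work out which request has to be sent to log in to an account. What should I do?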

import requests
from bs4 import BeautifulSoup

link_list = []

URL = 'https://myip.ms/browse/sites/2/ipID/23.227.38.0/ipIDii/23.227.38.255/own/376714'

HEADERS = {
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 YaBrowser/20.12.2.105 Yowser/2.5 Safari/537.36',
    'accept': '*/*',
}


def get_html(url, params=None):
    # Return the full Response so the caller can check the status code.
    return requests.get(url, headers=HEADERS, params=params, timeout=30)


def get_content(html):
    # Collect the href of the first link in every <td class="row_name"> cell.
    soup = BeautifulSoup(html, 'html.parser')
    for item in soup.find_all('td', class_='row_name'):
        anchor = item.find('a')
        if anchor is None:  # skip cells without a link instead of raising AttributeError
            continue
        link_list.append({'link': anchor.get('href')})


def parser():
    print(URL)
    response = get_html(URL)
    if response.status_code == 200:
        get_content(response.text)
    else:
        print('Error:', response.status_code)


parser()
print(link_list)

Tags: python, parsing

Solution
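Before assuming a login is required, it may help to look at what the server actually returns for the second page. An empty result with status 200 often means the response is a different document (for example a limit or login notice) that simply contains no `td.row_name` cells. The following is a minimal diagnostic sketch, not a confirmed fix: it uses only the standard library, and the names `RowLinkParser` and `diagnose` as well as the sample HTML are hypothetical, made up for illustration.

```python
from html.parser import HTMLParser


class RowLinkParser(HTMLParser):
    """Collect hrefs of <a> tags inside <td class="row_name"> plus the page <title>."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.title = ""
        self._in_row_cell = False
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "td":
            # Track whether the current cell carries the row_name class.
            classes = (attrs.get("class") or "").split()
            self._in_row_cell = "row_name" in classes
        elif tag == "a" and self._in_row_cell and attrs.get("href"):
            self.links.append(attrs["href"])
        elif tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_row_cell = False
        elif tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def diagnose(html):
    """Summarise a response body: its title, the row links found, and its size."""
    parser = RowLinkParser()
    parser.feed(html)
    return {
        "title": parser.title.strip(),
        "links": parser.links,
        "length": len(html),
    }


# Hypothetical sample bodies, invented for this example only.
sample_with_rows = (
    "<html><head><title>Sites</title></head><body><table><tr>"
    '<td class="row_name"><a href="/view/sites/1/example.com">example.com</a></td>'
    "</tr></table></body></html>"
)
sample_without_rows = (
    "<html><head><title>Notice</title></head><body><p>No table here</p></body></html>"
)

print(diagnose(sample_with_rows))
print(diagnose(sample_without_rows))
```

In the script from the question, `diagnose(response.text)` could be called for both URLs and the two summaries compared. If the second page has a different title or a much shorter body, the server is sending something other than the table. In that case, reusing one `requests.Session()` for all requests keeps any cookies the site sets between pages; whether myip.ms needs that, or an actual login, is an assumption that this check can help confirm or rule out.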

