Web scraping problem (empty list)

Problem description

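I'm currently trying to get a player's ranks, but I always get an empty list back. I've been struggling with this for a while and would really appreciate some help, along with any tips for solving this kind of problem in future projects, or pointers to places where I can get a better general understanding of BeautifulSoup.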

import requests as req
from bs4 import BeautifulSoup

id = "epic"
tag = "random"
url = f"https://rocketleague.tracker.network/rocket-league/profile/{id}/{tag.lower()}/overview"
html = req.get(url).content
soup = BeautifulSoup(html,"lxml")
line = soup.findAll("div",{"class":"rank"})
print(line)

This is what I'm trying to get:

[Screenshot showing the desired element]

Tags: python, web-scraping

Solution


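Loading the page with requests doesn't work here, simply because the site loads its content dynamically, so you need to use a webdriver such as Selenium.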
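One way to check whether a page builds its content dynamically is to count the target elements in the raw HTML the server returns. The following is a minimal sketch using only the standard library; the helper name `count_in_static_html` and both sample HTML strings are hypothetical and only illustrate the idea.

```python
from html.parser import HTMLParser


class ClassCounter(HTMLParser):
    """Count start tags of one kind whose class attribute contains a given class."""

    def __init__(self, tag, class_name):
        super().__init__()
        self.tag = tag
        self.class_name = class_name
        self.count = 0

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value is None for bare attributes
        classes = (dict(attrs).get("class") or "").split()
        if tag == self.tag and self.class_name in classes:
            self.count += 1


def count_in_static_html(html, tag="div", class_name="rank"):
    """Return how many matching elements appear in the HTML text as served."""
    parser = ClassCounter(tag, class_name)
    parser.feed(html)
    return parser.count


# Hypothetical responses, for illustration only.
server_rendered = '<div class="rank">#1 • Top 1%</div>'
js_rendered = '<div id="app"></div><script src="app.js"></script>'

print(count_in_static_html(server_rendered))  # 1
print(count_in_static_html(js_rendered))      # 0
```

Passing `req.get(url).text` from the question to this helper and getting 0 would suggest that the `rank` elements are added later by JavaScript, which is why BeautifulSoup finds nothing in the downloaded HTML.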

Here's what I came up with:

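from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.firefox.options import Options
from selenium.webdriver.firefox.service import Service

options = Options()
options.add_argument("--headless")  # run Firefox without opening a window
# Selenium 4 API: the geckodriver path is passed through a Service object
service = Service(executable_path=r"driver/geckodriver.exe")
driver = webdriver.Firefox(service=service, options=options)

id = "epic"
tag = "random"
url = f"https://rocketleague.tracker.network/rocket-league/profile/{id}/{tag.lower()}/overview"

driver.implicitly_wait(2)  # wait up to 2 seconds for elements to appear; tweak as needed
driver.get(url)
# Selenium 4 API: locate elements with find_elements and a By strategy
ranks = driver.find_elements(By.CSS_SELECTOR, '[class="rank"]')

for rank in ranks:
    print(rank.text)
driver.quit()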

Result:

#695,273 • Top 32%
#1,409,786 • Bottom 26%
#1,240,839 • Bottom 43%
#1,195,373 • Bottom 45%
#875,794 • Top 41%
#874,338 • Top 41%
#1,530,195 • Bottom 29%
#960,411 • Top 45%
Unranked Division I
#1,663,447 • Top 33%
Gold III Division III
#2,974,386 • Bottom 29%
Platinum II Division III
#2,879,016 • Bottom 40%
Platinum II Division II
#2,363,741 • Top 50%
Gold III Division III
#2,407,466 • Bottom 42%
Platinum II Division I
#2,054,002 • Top 48%
Gold II Division II
#2,321,030 • Bottom 41%
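The output mixes two kinds of lines: leaderboard positions such as `#695,273 • Top 32%` and tier names such as `Gold III Division III`. If only the numeric positions are needed, a small parser can separate them. This is a sketch that assumes the text format shown above stays the same; the function name `parse_rank_line` and the dictionary keys are hypothetical.

```python
import re

# Matches lines such as "#695,273 • Top 32%".
POSITION_RE = re.compile(r"^#([\d,]+) • (Top|Bottom) (\d+)%$")


def parse_rank_line(text):
    """Return a dict for a leaderboard-position line, or None for any other line."""
    match = POSITION_RE.match(text.strip())
    if match is None:
        return None
    position, side, percent = match.groups()
    return {
        "position": int(position.replace(",", "")),  # drop thousands separators
        "side": side,
        "percent": int(percent),
    }


print(parse_rank_line("#695,273 • Top 32%"))
# {'position': 695273, 'side': 'Top', 'percent': 32}
print(parse_rank_line("Gold III Division III"))
# None
```

In the Selenium loop, calling `parse_rank_line(rank.text)` on each element and skipping `None` results would keep only the position lines.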
