javascript - How to scrape a rating with Python requests-html?
Question
I'm having trouble getting hold of the rating information on a website using requests-html. Here is the code I wrote:
from requests_html import HTMLSession
import requests
from bs4 import BeautifulSoup
import re
url="https://www.immobilienscout24.de/expose/107160613/"
session=HTMLSession()
r=session.get(url)
r.html.render()
rating=r.html.find("div#style__truncateChild___2Z9XG is24-rating",first=False)
print(rating)
The website's HTML for the rating information is as follows:
However, all I get is this error message:
Traceback (most recent call last):
File "D:/Program Files/python/draft.py", line 8, in <module>
r.html.render()
File "E:\master\thesis\thesis\venv\lib\site-packages\requests_html.py", line 583, in render
content, result, page = self.session.loop.run_until_complete(_async_render(url=self.url, script=script, sleep=sleep, wait=wait, content=self.html, reload=reload, scrolldown=scrolldown, timeout=timeout, keep_page=keep_page))
File "D:\Program Files\python\lib\asyncio\base_events.py", line 568, in run_until_complete
return future.result()
File "E:\master\thesis\thesis\venv\lib\site-packages\requests_html.py", line 545, in _async_render
await page.goto(url, options={'timeout': int(timeout * 1000)})
File "E:\master\thesis\thesis\venv\lib\site-packages\pyppeteer\page.py", line 854, in goto
result = await self._navigate(url, referrer)
File "E:\master\thesis\thesis\venv\lib\site-packages\pyppeteer\page.py", line 869, in _navigate
'Page.navigate', {'url': url, 'referrer': referrer})
pyppeteer.errors.NetworkError: Protocol error Page.navigate: Target closed.
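What I expect is to get the relevant rating information: 3 Sterne.
Answer
I realize this is quite old, but I was able to get something by using the async session and setting a timeout: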
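from requests_html import AsyncHTMLSession

s = AsyncHTMLSession()

async def main():
    r = await s.get('https://www.immobilienscout24.de/expose/107160613/')
    # Give the headless browser up to 60 seconds to render the page
    await r.html.arender(timeout=60)
    # Match every <span> whose class attribute contains "rating"
    print(r.html.find('span[class*=rating]'))

s.run(main)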
[<Element 'span' class=('overall-rating', 'margin-right-s') title='4,2 Sterne'>,
<Element 'span' class=('overall-rating', 'margin-right-s') title='4,2 Sterne'>,
<Element 'span' class=('overall-rating', 'margin-right-s') title='4,2 Sterne'>,
<Element 'span' class=('overall-rating', 'margin-right-s') title='4,2 Sterne'>]
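The output above still holds the rating as a German-formatted title string ('4,2 Sterne'), repeated four times. As a minimal sketch of one way to turn that into a number, the hypothetical helper `parse_rating` below (not part of requests-html) pulls the first number out of such a title and converts the decimal comma. The `titles` list is hard-coded from the output shown; with requests-html the strings could presumably be read from each element through `element.attrs.get('title')`.

```python
import re


def parse_rating(title):
    """Return the first number in a title such as '4,2 Sterne' as a float, or None."""
    match = re.search(r"\d+(?:[.,]\d+)?", title)
    if match is None:
        return None
    # German notation uses a decimal comma, so normalise it before converting.
    return float(match.group(0).replace(",", "."))


# Title strings copied from the output above (assumption: in a live script they
# would come from the matched elements' title attributes).
titles = ["4,2 Sterne"] * 4

# The same span is matched several times, so keep only the unique numeric values.
parsed = (parse_rating(t) for t in titles)
ratings = sorted({value for value in parsed if value is not None})
print(ratings)
```

Collapsing the values into a set is only a convenience for this page, where the four matches are identical; if the order or count of matches matters, keep the list as it is.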