Python BeautifulSoup: how do I locate a span?

Problem description

I have been trying to find a span on an HTML page, but it isn't working. Could someone give me the code? Thanks a lot.

<div ng-repeat="m in messages" ng-if="hasMessage(m.message)" class="message-box success" ng-class="{ 'error': m.type == 'error', 'success': m.type == 'success', 'info': m.type == 'info', 'promotion': m.type == 'promotion' }"> <span ng-bind-html="m.message">Congratulations! Your $60 discount has been applied, enjoy $20 off your first 3 boxes.</span> <!----> </div>

I tried this code:

soup = BeautifulSoup(r.text)
badges = soup.body.find('span', attrs={'class': 'message-box'})
for span in badges.span.find_all('span', recursive=False):
    print(span.attrs['title'])

I want to extract the $60 part.

Tags: python, html, beautifulsoup

Solution


You can use BeautifulSoup to select the sentence, but to get the $60 part you need another technique, such as the re module:

txt = '''<div ng-repeat="m in messages" ng-if="hasMessage(m.message)" class="message-box success" ng-class="{ 'error': m.type == 'error', 'success': m.type == 'success', 'info': m.type == 'info', 'promotion': m.type == 'promotion' }"> <span ng-bind-html="m.message">Congratulations! Your $60 discount has been applied, enjoy $20 off your first 3 boxes.</span> <!----> </div>'''

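import re
from bs4 import BeautifulSoup

soup = BeautifulSoup(txt, 'html.parser')

# Select the <span> by its ng-bind-html attribute and take its text
text = soup.select_one('span[ng-bind-html="m.message"]').text

# The first "$" followed by digits is the discount amount
print(re.search(r'(\$\d+)', text).group(1))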

Prints:

$60
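
In the question's attempt, the class message-box sits on the <div>, not on the <span>, and the <span> has no title attribute, so the lookup for a span with that class finds nothing. As a minimal sketch of a more defensive variant, the code below selects the span through its parent div and returns None instead of raising when either the span or a dollar amount is missing. The function name first_dollar_amount, the shortened sample markup, and the selector div.message-box > span are illustrative assumptions, not part of the original answer.

```python
import re
from bs4 import BeautifulSoup

# Shortened sample markup based on the snippet in the question (assumption:
# the real page has the same div/span nesting).
html = '''<div class="message-box success"> <span ng-bind-html="m.message">Congratulations! Your $60 discount has been applied, enjoy $20 off your first 3 boxes.</span> </div>'''


def first_dollar_amount(html):
    """Return the first '$<digits>' found in the message span, or None."""
    soup = BeautifulSoup(html, 'html.parser')

    # The class is on the <div>, so match the div and take its direct child <span>.
    span = soup.select_one('div.message-box > span')
    if span is None:
        return None

    # Look for the first dollar sign followed by one or more digits.
    match = re.search(r'\$\d+', span.get_text())
    return match.group(0) if match else None


print(first_dollar_amount(html))
print(first_dollar_amount('<p>no message here</p>'))
```

If every amount in the sentence is needed rather than only the first, re.findall with the same pattern could be used instead of re.search. Note also that ng-bind-html is an AngularJS attribute, so if the page fills the span in with JavaScript, the HTML returned by the request may not contain the message at all, in which case the function above would simply return None.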
