Need to get the text contained in a class

Problem description

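I need to capture the text from each box, up to the end of the matching block. The heading changes many times across the HTML. I can capture all of the data, but is there a better way?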

<div class="box">
    <a class="visual" href="https://www.example.com">
        <img src="https://www.example.com/img.jpg" alt="image description">
        <h2>Ventura</h2>
    </a>
    <div class="status-row">
        <div class="service">
            <span class="icon nowork"></span> No work                                   
        </div>
        <div class="work">
            <div class="number">0</div> Planned Work
        </div>
    </div>
</div>

<div class="box">
    <a class="visual" href="https://www.example.com">
        <img src="https://www.example.com/img.jpg" alt="image description">
        <h2>Boston</h2>
    </a>
    <div class="status-row">
        <div class="service">
            <span class="icon disruption"></span> Disruptions                                   
        </div>
        <div class="no-work">
            <div class="number">0</div> No Work
        </div>
    </div>
</div>

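import requests
from bs4 import BeautifulSoup

# `url` is the address of the page being scraped (not shown in the question)
page = requests.get(url, verify=False)
soup = BeautifulSoup(page.text, 'html.parser')

s = 'Ventura'

# Look at every "box" and "status-row" div and print the text of those that mention s
for x in soup.find_all("div", {"class": ["box", "status-row"]}):
    z = x.get_text()
    if s in z:
        print(z)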

Is there a better way?

Tags: python-3.x, beautifulsoup

Solution
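
One possible approach, offered as a sketch rather than a definitive answer: instead of pulling the text of every div and searching it for the name, walk the "box" divs, compare each box's <h2> with the name you want, and read only that box's "status-row". The helper name box_status and the inline SAMPLE_HTML string are assumptions made for illustration; on the real page the markup would come from page.text, and the code assumes every box keeps the same h2 / status-row layout shown in the question.

```python
from bs4 import BeautifulSoup

# Markup trimmed from the question so the example runs on its own;
# on the real site this string would be page.text from requests.get(url).
SAMPLE_HTML = """
<div class="box">
    <a class="visual" href="https://www.example.com">
        <h2>Ventura</h2>
    </a>
    <div class="status-row">
        <div class="service">
            <span class="icon nowork"></span> No work
        </div>
        <div class="work">
            <div class="number">0</div> Planned Work
        </div>
    </div>
</div>
<div class="box">
    <a class="visual" href="https://www.example.com">
        <h2>Boston</h2>
    </a>
    <div class="status-row">
        <div class="service">
            <span class="icon disruption"></span> Disruptions
        </div>
        <div class="no-work">
            <div class="number">0</div> No Work
        </div>
    </div>
</div>
"""


def box_status(soup, title):
    """Hypothetical helper: status-row strings of the box whose <h2> equals title.

    Returns None when no box has that heading, and an empty list when the
    matching box has no status-row.
    """
    for box in soup.find_all("div", class_="box"):
        heading = box.find("h2")
        # Skip boxes without a heading or with a different heading
        if heading is None or heading.get_text(strip=True) != title:
            continue
        row = box.find("div", class_="status-row")
        if row is None:
            return []
        # stripped_strings yields each text fragment with whitespace trimmed
        return list(row.stripped_strings)
    return None


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
print(box_status(soup, "Ventura"))  # ['No work', '0', 'Planned Work']
```

Matching on the <h2> exactly avoids false hits when the name also appears elsewhere in a box's text, and stripped_strings removes the padding whitespace that get_text() would otherwise return.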

