How to convert a bs4 findAll() result to strings

Problem description

Here is my code:

import requests
from bs4 import BeautifulSoup

# url and headers are defined elsewhere
with requests.Session() as s:
    r = s.get(url, headers=headers)
    soup = BeautifulSoup(r.text, 'html.parser')
    sizes = soup.findAll(True, {'class': 'product__sizes-size-1'})

I want the sizes as string objects rather than tags, so that I can run:

parsed_sizes = [item for item in sizes if 1 <= item <= 20]

The comparison needs strings, not tags. Right now, printing sizes outputs:

[<span class="product__sizes-size-1">6</span>, <span class="product__sizes-size-1">6.5</span>, <span class="product__sizes-size-1">7</span>, <span class="product__sizes-size-1">7.5</span>, <span class="product__sizes-size-1">8</span>, <span class="product__sizes-size-1">8.5</span>, <span class="product__sizes-size-1">9</span>, <span class="product__sizes-size-1"></span>, <span class="product__sizes-size-1"></span>, <span class="product__sizes-size-1"></span>, <span class="product__sizes-size-1"></span>, <span class="product__sizes-size-1"></span>, <span class="product__sizes-size-1"></span>, <span class="product__sizes-size-1"></span>]

If I call type() on it, I get <class 'bs4.element.ResultSet'>.

Tags: python, html, database, beautifulsoup

Solution


You need to get each tag's text and convert it to a number; then the comparison will work.

For example:

from bs4 import BeautifulSoup

sizes = """[<span class="product__sizes-size-1">6</span>, <span class="product__sizes-size-1">6.5</span>, <span class="product__sizes-size-1">7</span>, <span class="product__sizes-size-1">7.5</span>, <span class="product__sizes-size-1">8</span>, <span class="product__sizes-size-1">8.5</span>, <span class="product__sizes-size-1">9</span>, <span class="product__sizes-size-1"></span>, <span class="product__sizes-size-1"></span>, <span class="product__sizes-size-1"></span>, <span class="product__sizes-size-1"></span>, <span class="product__sizes-size-1"></span>, <span class="product__sizes-size-1"></span>, <span class="product__sizes-size-1"></span>]
"""

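# string=True keeps only tags that contain text, so the empty <span> elements are skipped.
# (string= is the current name for the older text= argument.)
tags = BeautifulSoup(sizes, "html.parser").find_all(True, {'class': 'product__sizes-size-1'}, string=True)
parsed_sizes = [
    tag.get_text(strip=True) for tag in tags
    if 1 <= float(tag.get_text(strip=True)) <= 20
]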
print(parsed_sizes)

Output:

['6', '6.5', '7', '7.5', '8', '8.5', '9']
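
If you would rather keep the sizes as numbers, or the page may contain non-numeric text in those spans, a small helper along these lines might be a safer variant. This is only a sketch: the name sizes_in_range, its low/high parameters, and the sample HTML (including the "25" and "N/A" entries) are made up for illustration and are not part of the original question.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup; "25" and "N/A" are added to show the filtering.
html = """
<span class="product__sizes-size-1">6</span>
<span class="product__sizes-size-1">6.5</span>
<span class="product__sizes-size-1">25</span>
<span class="product__sizes-size-1">N/A</span>
<span class="product__sizes-size-1"></span>
"""


def sizes_in_range(tags, low=1, high=20):
    """Return the tags' text as floats, keeping only values within [low, high]."""
    result = []
    for tag in tags:
        text = tag.get_text(strip=True)
        try:
            value = float(text)
        except ValueError:
            # Empty or non-numeric text (such as "" or "N/A") is skipped.
            continue
        if low <= value <= high:
            result.append(value)
    return result


tags = BeautifulSoup(html, "html.parser").find_all(class_="product__sizes-size-1")
print(sizes_in_range(tags))  # [6.0, 6.5]
```

Because the conversion is wrapped in try/except, this version does not depend on filtering out empty tags beforehand, and it returns floats that can be compared or sorted directly.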
