TypeError when extracting text from a BeautifulSoup object

Problem description

I have a list of bs4 elements, lines, selected by their span class. I am trying to get their text but run into the TypeError shown below.

lines contains:

[<span class="lt-line-clamp__line">I'm excited to be entering a new phase of my career at Xyz!</span>,
 <span class="lt-line-clamp__line"></span>,
 <span class="lt-line-clamp__line lt-line-clamp__line--last">
       I'm a program manager, product development leader, and business strategist who is passionate about delive<span class="lt-line-clamp__ellipsis">...
             <a aria-expanded="false" class="lt-line-clamp__more" data-test-line-clamp-show-more-button="true" href="#" id="line-clamp-show-more-button" role="button">see more</a>
 </span></span>]

Code:

lines = about.select('span.lt-line-clamp__line')  # lines holds the elements shown above
about = ''.join([line.find(text=True, recursive=False) for line in lines])

Error:

TypeError                                 Traceback (most recent call last)
<ipython-input-104-45a217b0c18f> in <module>
      1 lines = about.select('span.lt-line-clamp__line')
----> 2 about = ''.join([line.find(text=True, recursive=False) for line in lines])

TypeError: sequence item 1: expected str instance, NoneType found

 

Tags: python, beautifulsoup

Solution


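find can return None when it finds no text. Here the second span is empty, so find returns None for it, and ''.join raises a TypeError because every item it joins must be a str. That is why the traceback points at sequence item 1.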
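The failure can be reproduced without BeautifulSoup. The following is a minimal sketch in which the list results is a hypothetical stand-in for what find returned for the three spans (text, None, text); it only illustrates how str.join reacts to a None item.

```python
# Hypothetical stand-ins for the three find() results: the second span
# is empty, so its entry is None.
results = [
    "I'm excited to be entering a new phase of my career at Xyz!",
    None,
    "\n       I'm a program manager, product development leader, "
    "and business strategist who is passionate about delive",
]

try:
    ''.join(results)
except TypeError as exc:
    # str.join reports the index of the first non-str item.
    message = str(exc)

print(message)
```

The index in the message counts from zero, so item 1 is the second element of the list.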

Try the code below. It skips any line for which no text is found.

about = ''.join([line.find(text=True, recursive=False) for line in lines if line.find(text=True, recursive=False)])
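The one-liner above calls find twice for every element. As an optional variation, here is a sketch of a helper that calls it once; the name join_direct_text is made up for illustration and is not part of bs4. It assumes each element offers the same find(text=True, recursive=False) call used in the question.

```python
def join_direct_text(lines):
    """Join the first direct text node of each element, skipping elements without one."""
    parts = []
    for line in lines:
        # Same call as in the answer, made only once per element.
        text = line.find(text=True, recursive=False)
        if text is not None:
            # str() turns bs4's string subclass into a plain str.
            parts.append(str(text))
    return ''.join(parts)
```

With the lines list from the question, it would be used as about = join_direct_text(lines). The text of the third span keeps its leading whitespace, so a strip() on the result may be wanted.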
