python - TypeError when extracting text from a BeautifulSoup object
Problem description
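I have a list of bs4 elements named lines, selected by span class. I am trying to get their text, but I run into the TypeError shown below.
lines contains: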
[<span class="lt-line-clamp__line">I'm excited to be entering a new phase of my career at Xyz!</span>,
<span class="lt-line-clamp__line"></span>,
<span class="lt-line-clamp__line lt-line-clamp__line--last">
I'm a program manager, product development leader, and business strategist who is passionate about delive<span class="lt-line-clamp__ellipsis">...
<a aria-expanded="false" class="lt-line-clamp__more" data-test-line-clamp-show-more-button="true" href="#" id="line-clamp-show-more-button" role="button">see more</a>
</span></span>]
Code:
lines = about.select('span.lt-line-clamp__line')  # lines holds the elements shown above
about = ''.join([line.find(text=True, recursive=False) for line in lines])
Error:
TypeError Traceback (most recent call last)
<ipython-input-104-45a217b0c18f> in <module>
1 lines = about.select('span.lt-line-clamp__line')
----> 2 about = ''.join([line.find(text=True, recursive=False) for line in lines])
TypeError: sequence item 1: expected str instance, NoneType found
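Solution
find may return None when an element has no text of its own. Here the second span is empty, so the list passed to join contains None at index 1, which is exactly what the error message reports.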
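The failure can be reproduced without BeautifulSoup at all. The sketch below uses a hypothetical list, parts, standing in for the values find returned, with None in the same position as the empty span:

```python
# Hypothetical stand-in for the values returned by find():
# the middle entry mimics the empty span, for which find() gave None.
parts = ["first", None, "last"]

try:
    "".join(parts)
except TypeError as exc:
    # str.join accepts only str items, so the None at index 1 is rejected.
    message = str(exc)

print(message)
```

The reported index matches the position of the element that had no text, which makes it easy to see which span caused the problem.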
Try the following code, which skips any line where no text is found:
about = ''.join([line.find(text=True, recursive=False) for line in lines if line.find(text=True, recursive=False)])
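The one-liner above calls find twice per element. A possible variant, sketched here under a few assumptions, looks the text up once and then filters out the None results. The html string is a trimmed-down, made-up copy of the markup from the question, and string= is used in place of text=, which is the newer name for the same argument in Beautiful Soup 4.4.0 and later.

```python
from bs4 import BeautifulSoup

# Trimmed-down, hypothetical copy of the markup from the question.
html = (
    '<div>'
    '<span class="lt-line-clamp__line">First line.</span>'
    '<span class="lt-line-clamp__line"></span>'
    '<span class="lt-line-clamp__line lt-line-clamp__line--last">'
    'Last line<span class="lt-line-clamp__ellipsis">...</span>'
    '</span>'
    '</div>'
)

soup = BeautifulSoup(html, "html.parser")
lines = soup.select("span.lt-line-clamp__line")

# Look up each span's own text once; find() returns None for the empty span.
texts = (line.find(string=True, recursive=False) for line in lines)

# Keep only the spans that actually had text, then join them.
about = "".join(text for text in texts if text)
print(about)
```

Because recursive=False is kept, the text of the nested ellipsis span is still ignored, as in the original answer.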