python - BeautifulSoup - Getting the text after a tag
Problem description
Solution
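I think this is what you are looking for. For each strong tag, it finds the parent p element, converts that element to a string, removes the strong element's markup from the string, and then parses the result back into a BeautifulSoup object.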
from bs4 import BeautifulSoup
soup = BeautifulSoup("<p><strong>High School Honors: </strong><em>Parade </em>All-American; <em>Chicago Sun-Times </em>Illinois Player of the Year honors; rushed for 2,100 yards and 31 TDs as a senior; led team to 14-0 record and Class 4A State Championship as a junior with 1,820 yards and 26 TDs; also lettered in baseball.</p>", 'html.parser')
headerList = []
infoList = []
for strong_tag in soup.find_all('strong'):
    # Enclosing <p> element of this <strong> tag
    parent = strong_tag.find_parent('p')
    # Serialize the paragraph and cut out the <strong> markup
    content = str(parent).replace(str(strong_tag), '')
    # Re-parse the remaining markup so it is a soup object again
    souped_content = BeautifulSoup(content, 'html.parser')
    infoList.append(souped_content)
    headerList.append(strong_tag)
print(headerList)
print(infoList)
This outputs the following:
[<strong>High School Honors: </strong>]
[<p><em>Parade </em>All-American; <em>Chicago Sun-Times </em>Illinois Player of the Year honors; rushed for 2,100 yards and 31 TDs as a senior; led team to 14-0 record and Class 4A State Championship as a junior with 1,820 yards and 26 TDs; also lettered in baseball.</p>]
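If only the plain text after the strong tag is needed, rather than a re-parsed soup object, one alternative is to walk the tag's next_siblings instead of round-tripping through a string. This is a minimal sketch, not part of the original answer: the shortened HTML and the names headers and infos are illustrative, and it assumes all the wanted text sits in the same parent as the strong tag.

```python
from bs4 import BeautifulSoup, Tag

# Shortened sample markup (illustrative, not the full paragraph from the answer)
html = (
    "<p><strong>High School Honors: </strong><em>Parade </em>"
    "All-American; also lettered in baseball.</p>"
)
soup = BeautifulSoup(html, "html.parser")

headers = []
infos = []
for strong_tag in soup.find_all("strong"):
    headers.append(strong_tag.get_text(strip=True))
    # Collect the text of every node that follows <strong> in the same parent:
    # nested tags contribute their inner text, bare strings are used as-is.
    following = "".join(
        sibling.get_text() if isinstance(sibling, Tag) else str(sibling)
        for sibling in strong_tag.next_siblings
    )
    infos.append(following.strip())

print(headers)
print(infos)
```

This avoids the string replace, which would also strip any other occurrence of identical strong markup elsewhere in the same paragraph.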