python-3.x - How to efficiently extract the innermost content inside this class?
Problem description
I want to replace the value of href with the inner text of each element of class lienarticle in the following text:
<a class="lienarticle" href="/dictionnaires/francais/aimer/1925">mono</a>
<a class="lienarticle" href="/dictionnaires/francais/aimer/1925"><i>aimer</i></a>
<a class="lienarticle" href="/dictionnaires/francais/aimer/1925"><b>you</b></a>
My current approach is rudimentary:
from bs4 import BeautifulSoup
text = '''
<a class="lienarticle" href="/dictionnaires/francais/aimer/1925">mono</a>
<a class="lienarticle" href="/dictionnaires/francais/aimer/1925"><i>aimer</i></a>
<a class="lienarticle" href="/dictionnaires/francais/aimer/1925"><b>you</b></a>
'''
soup = BeautifulSoup(text, 'html.parser')
for a in soup.select('.lienarticle'):
    a['href'] = 'entry://' + str(a.contents[0]).replace('<b>', '').replace('</b>', '').replace('<i>', '').replace('</i>', '')
The desired result is
<a class="lienarticle" href="entry://mono">mono</a>
<a class="lienarticle" href="entry://aimer"><i>aimer</i></a>
<a class="lienarticle" href="entry://you"><b>you</b></a>
Is there a more efficient way to do this than chaining string replacements as I do? Thank you so much!
Solution
Here is one approach, using the .text attribute.
Example:
from bs4 import BeautifulSoup
text = '''
<a class="lienarticle" href="/dictionnaires/francais/aimer/1925">mono</a>
<a class="lienarticle" href="/dictionnaires/francais/aimer/1925"><i>aimer</i></a>
<a class="lienarticle" href="/dictionnaires/francais/aimer/1925"><b>you</b></a>
'''
soup = BeautifulSoup(text, 'html.parser')
for a in soup.select('.lienarticle'):
    a['href'] = f'entry://{a.text}'
    print(a)
Output:
<a class="lienarticle" href="entry://mono">mono</a>
<a class="lienarticle" href="entry://aimer"><i>aimer</i></a>
<a class="lienarticle" href="entry://you"><b>you</b></a>
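This works because .text returns the concatenated text of every descendant of the tag, so the nesting depth of <i> or <b> does not matter. If the real pages contain stray whitespace or deeper nesting, a variant with get_text(strip=True) may be more robust. The following is only a sketch: the rewrite_links helper and the sample markup (padded text, doubly nested tags) are made up for illustration and are not from the question.

```python
from bs4 import BeautifulSoup

# Hypothetical sample: padded text in the first link, two levels of nesting in the second.
html = '''
<a class="lienarticle" href="/dictionnaires/francais/aimer/1925"> mono </a>
<a class="lienarticle" href="/dictionnaires/francais/aimer/1925"><b><i>aimer</i></b></a>
'''

def rewrite_links(markup):
    """Point each a.lienarticle href at entry://<visible text> (illustrative helper)."""
    soup = BeautifulSoup(markup, 'html.parser')
    for a in soup.select('a.lienarticle'):
        # get_text(strip=True) collects all descendant text and strips
        # surrounding whitespace from each fragment.
        a['href'] = 'entry://' + a.get_text(strip=True)
    return soup

soup = rewrite_links(html)
hrefs = [a['href'] for a in soup.select('a.lienarticle')]
print(hrefs)
```

Note that with strip=True each text fragment is stripped separately, so a link with mixed content such as see <b>this</b> would be joined without the space unless a separator is passed, for example a.get_text(' ', strip=True).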