python - Beautifulsoup: printing attribute values from divs with the same class
Problem description
I have the following working code, which prints the text after value=:
soup = BeautifulSoup(html, 'lxml')
name = soup.find('input')['value']
print(name)
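However, the page has multiple divs with the same class. I tried findAll, but I get an error and can only print the first field's value, i.e. the name.
Please see the attached screenshot.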
<div class="control-group"><label class="control-label required" for="client_appbundle_prospecttype_ProspectFirstContact_decision_timeframe">What date do you want to make a decision?</label>
<div class="controls"><input type="text" id="client_appbundle_prospecttype_ProspectFirstContact_decision_timeframe" name="client_appbundle_prospecttype[ProspectFirstContact][decision_timeframe]" required="required" class="input-small text-bound datepicker hasDatepicker"></div>
</div>
</div>
</div>
</div>
</div>
<div class="tab-pane active" id="prospect_consultation">
<div class="widget row-fluid">
<div class="span12">
<div class="navbar">
<div class="navbar-inner">
<h6>Personal details</h6>
</div>
</div>
<div class="well">
<div class="control-group">
<label class="control-label">Name</label>
<div class="controls">
Sam Test-March 2018
</div>
</div>
<div class="control-group">
<label class="control-label">Address and postcode</label>
<div class="controls">
</div>
</div>
<div class="control-group">
<label class="control-label">Mobile number</label>
<div class="controls">
12345678
</div>
</div>
<div class="control-group">
<label class="control-label">Email address</label>
<div class="controls">
test@test.com
</div>
</div>
Thanks!
Solution
Maybe something like this:
from bs4 import BeautifulSoup
html = '''
<html>
<head></head>
<body>
<div class="control-group">
<label class="control-label required" for="client_appbundle_prospecttype_ProspectFirstContact_decision_timeframe">What date do you want to make a decision?</label>
<div class="controls">
<input type="text" id="client_appbundle_prospecttype_ProspectFirstContact_decision_timeframe" name="client_appbundle_prospecttype[ProspectFirstContact][decision_timeframe]" required class="input-small text-bound datepicker hasDatepicker">
</div>
</div>
<div class="tab-pane active" id="prospect_consultation">
<div class="widget row-fluid">
<div class="span12">
<div class="navbar">
<div class="navbar-inner">
<h6>Personal details</h6>
</div>
</div>
<div class="well">
<div class="control-group">
<label class="control-label">Name</label>
<div class="controls">
Sam Test-March 2018
</div>
</div>
<div class="control-group">
<label class="control-label">Address and postcode</label>
<div class="controls">
</div>
</div>
<div class="control-group">
<label class="control-label">Mobile number</label>
<div class="controls">
12345678
</div>
</div>
<div class="control-group">
<label class="control-label">Email address</label>
<div class="controls">
test@test.com
</div>
</div>
</div>
</div>
</div>
</div>
</body>
</html>
'''
soup = BeautifulSoup(html, "lxml")
items = soup.select('.controls')
print([item.text.strip() for item in items if item.text.strip()])
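The list above drops the field names, so here is a rough sketch of one way to keep each value paired with its label. It rests on several assumptions: the markup is a trimmed-down, hypothetical version of the page from the question, the value attribute on the input is invented for illustration, extract_fields is a made-up helper name, and Python's built-in "html.parser" is used so the snippet runs without lxml (swap in "lxml" if it is installed). Note that find_all returns a list-like ResultSet, so indexing it with ['value'] the way a single tag is indexed will fail; each element has to be visited in a loop instead.

```python
from bs4 import BeautifulSoup

# Hypothetical, shortened markup modelled on the question's page.
# The value="2018-03-01" attribute is an assumption added for illustration.
html = '''
<div class="control-group">
  <label class="control-label required">Decision date</label>
  <div class="controls"><input type="text" name="decision_timeframe" value="2018-03-01"></div>
</div>
<div class="control-group">
  <label class="control-label">Name</label>
  <div class="controls">
    Sam Test-March 2018
  </div>
</div>
<div class="control-group">
  <label class="control-label">Address and postcode</label>
  <div class="controls">
  </div>
</div>
<div class="control-group">
  <label class="control-label">Mobile number</label>
  <div class="controls">
    12345678
  </div>
</div>
'''

soup = BeautifulSoup(html, "html.parser")


def extract_fields(soup):
    """Map each label's text to the value inside its sibling .controls div."""
    fields = {}
    # find_all returns every matching div, so loop over the results.
    for group in soup.find_all("div", class_="control-group"):
        label = group.find("label", class_="control-label")
        controls = group.find("div", class_="controls")
        if label is None or controls is None:
            continue
        field = controls.find("input")
        if field is not None:
            # .get avoids a KeyError when the input has no value attribute.
            value = field.get("value", "")
        else:
            # Plain-text fields: take the stripped text of the div.
            value = controls.get_text(strip=True)
        fields[label.get_text(strip=True)] = value
    return fields


fields = extract_fields(soup)
for name, value in fields.items():
    print(f"{name}: {value}")
```

Empty fields such as the address are kept here with an empty string; filter them out afterwards if only filled-in values are wanted.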