python - How to select the first child of each element in a list in Beautiful Soup
Problem description
I want to get the text from the first inner div in each outer div:
<body>
  <div class="outer">
    <div class="inner">text1</div>
    <div class="inner">text2</div>
    <div class="inner">text3</div>
  </div>
  <div class="outer">
    <div class="inner">text4</div>
  </div>
  <div class="outer">
    <div class="inner">text5</div>
    <div class="inner">text6</div>
  </div>
</body>
This means retrieving text1, text4, and text5.
I've experimented with the code shown below:
outers = soup.select('body > .outer')
for outer in outers:
    inners = outer.select_one('.inner')
    for inner in inners:
        print(inner.text)
But I can't get it to work.
Solution
This may work:
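from bs4 import BeautifulSoup

# text is assumed to hold the HTML from the question
soup = BeautifulSoup(text, 'html.parser')
for outer in soup.find_all('div', class_='outer'):
    # find() returns only the first matching div, as a single Tag
    inners = outer.find('div', class_='inner')
    # Iterating over a Tag yields its children; here that is just the text node
    for inner in inners:
        print(inner)
# Output:
# text1
# text4
# text5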
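The loop above depends on each inner div holding nothing but text, because iterating over a Tag walks its direct children. The sketch below illustrates this with a hypothetical inner div that also contains a `<b>` element; that markup is an assumption for illustration and does not appear in the question.

```python
from bs4 import BeautifulSoup

# Hypothetical markup (not from the question): an inner div with nested markup.
snippet = '<div class="inner">text1 <b>bold</b></div>'
soup = BeautifulSoup(snippet, "html.parser")

inner = soup.find("div", class_="inner")

# Iterating over a Tag yields its direct children: a text node and the <b> tag.
children = list(inner)
print(len(children))

# get_text() joins all the text inside the Tag, nested elements included.
full_text = inner.get_text()
print(full_text)
```

With nested markup the loop would print each child separately, tags included, whereas `get_text()` returns the combined text in one string.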
Or you can do it this way:
soup = BeautifulSoup(text, 'html.parser')
for outer in soup.find_all('div', class_='outer'):
    # find() returns the first matching div, so no inner loop is needed
    inners = outer.find('div', class_='inner')
    print(inners.get_text())
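The CSS-selector attempt from the question can probably be repaired in the same spirit: `select_one` already returns a single Tag (or `None`), so the inner loop can be dropped. Below is a minimal, self-contained sketch using the question's HTML. The helper names `first_inner_texts` and `first_inner_texts_css` are made up for illustration, and the second variant assumes the first element child of every outer div is an `.inner` div.

```python
from bs4 import BeautifulSoup

# Sample markup copied from the question.
html = """
<body>
  <div class="outer">
    <div class="inner">text1</div>
    <div class="inner">text2</div>
    <div class="inner">text3</div>
  </div>
  <div class="outer">
    <div class="inner">text4</div>
  </div>
  <div class="outer">
    <div class="inner">text5</div>
    <div class="inner">text6</div>
  </div>
</body>
"""


def first_inner_texts(markup):
    """Return the text of the first .inner div inside each .outer div."""
    soup = BeautifulSoup(markup, "html.parser")
    texts = []
    for outer in soup.select("body > .outer"):
        first = outer.select_one(".inner")  # a single Tag, or None if absent
        if first is not None:
            texts.append(first.get_text())
    return texts


def first_inner_texts_css(markup):
    """Same result with one selector; assumes .inner is the first element child."""
    soup = BeautifulSoup(markup, "html.parser")
    return [tag.get_text() for tag in soup.select(".outer > .inner:first-child")]


result = first_inner_texts(html)
result_css = first_inner_texts_css(html)
print(result)
print(result_css)
```

The first helper skips any outer div without an inner div instead of raising an `AttributeError` on `None`, which the `find(...).get_text()` form would do.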