python - Looping through HTML with Beautiful Soup in Python
Problem description
I'm trying to loop through an HTML table.
On the page I'm looking at there is only one table, so that's easy to locate. Under it there are several <tr>s, and I want to loop through these, apart from some header rows defined by <th> instead of <td>s. Each <tr> consists of several <td>s with different classes. I only want to collect the two <td>s with class="table-name" and the <td> with class="table-score".
I have tried to work with:
rows = html.find("table", class_="table").find_all("tr")
for row in rows:
    if row.find("th") is None:
        td_names = row.findall("td")
        for td_name in td_names:
            print(td_name)
But I'm not really having any success with that.
So basically the html looks something like this:
<table>
<tr>
<th>Header</th>
</tr>
<tr>
<td class="table-rank">1</td>
<td class="table-name">John</td>
<td class="table-name">Jim</td>
<td class="table-place">Russia</td>
<td class="table-score">2-1</td>
</tr>
</table>
I'm only looking for "John", "Jim", "2-1".
Thanks in advance.
Solution
find_all() returns a list of all elements matching the filter. You can index into that list to pick the element you need: 0 for the first, 1 for the second, and so on.
from bs4 import BeautifulSoup
html="""
<table>
<tr>
<th>Header</th>
</tr>
<tr>
<td class="table-rank">1</td>
<td class="table-name">John</td>
<td class="table-name">Jim</td>
<td class="table-place">Russia</td>
<td class="table-score">2-1</td>
</tr>
</table>
"""
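soup = BeautifulSoup(html, 'html.parser')
# The second <tr> in the table (indexing starts at 0, so [0] is the header row)
our_tr = soup.find('table').find_all('tr')[1]
# Collect all <td>s of the second <tr>, then print the ones we need by position
our_tds = our_tr.find_all('td')
print(our_tds[1].text)  # first table-name cell
print(our_tds[2].text)  # second table-name cell
print(our_tds[4].text)  # table-score cell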
Output
John
Jim
2-1
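The index-based approach above depends on the row and cell positions staying fixed. Since the question asks to loop over every non-header row and pick cells by class, a minimal sketch of that variant might look like the following. The helper name extract_rows is hypothetical, and the sketch assumes each data row has the same class names as the sample HTML. Two things in the original attempt would also need changing: the method is find_all, not findall, and the sample table has no class attribute, so find("table", class_="table") would not match it.

```python
from bs4 import BeautifulSoup

html = """
<table>
<tr>
<th>Header</th>
</tr>
<tr>
<td class="table-rank">1</td>
<td class="table-name">John</td>
<td class="table-name">Jim</td>
<td class="table-place">Russia</td>
<td class="table-score">2-1</td>
</tr>
</table>
"""


def extract_rows(markup):
    """Hypothetical helper: return (names, score) for every non-header row."""
    soup = BeautifulSoup(markup, "html.parser")
    results = []
    for row in soup.find("table").find_all("tr"):
        # Skip header rows, which contain <th> cells instead of <td> cells
        if row.find("th") is not None:
            continue
        # Filter cells by class instead of relying on their position
        names = [td.get_text(strip=True)
                 for td in row.find_all("td", class_="table-name")]
        score_td = row.find("td", class_="table-score")
        score = score_td.get_text(strip=True) if score_td is not None else None
        results.append((names, score))
    return results


for names, score in extract_rows(html):
    print(*names, score)
```

Filtering with class_ keeps the code working if columns are added or reordered, at the cost of assuming the class names themselves are stable.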