How to scrape the second column from a table

Problem description

I am trying to scrape the data in the second column of a table, but it fails...

Here is my code:

import bs4
import requests
url = "https://en.wikipedia.org/wiki/List_of_postcode_districts_in_the_United_Kingdom"

data=requests.get(url)
soup=bs4.BeautifulSoup(data.text,'html.parser')
My_table = soup.find('table',{'class':'wikitable sortable'})
#print(My_table)
My_row = My_table.find_all('tr')
#print(My_row[1])
for row in My_row:
   data= (row.find('td')[1].text)
   print(data)

Here is the error:

TypeError: 'int' object is not subscriptable

What is the best solution?

Tags: python, web-scraping, beautifulsoup

Solution

This code seems to work:

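import bs4
import requests

url = "https://en.wikipedia.org/wiki/List_of_postcode_districts_in_the_United_Kingdom"

# Download the page and parse it with the built-in HTML parser.
data = requests.get(url)
soup = bs4.BeautifulSoup(data.text, 'html.parser')
table = soup.find('table', {'class': 'wikitable sortable'})
rows = table.find_all('tr')
for i, row in enumerate(rows):
    # Skip the header row (index 0).
    if i > 0:
        # row.children also yields the whitespace text nodes between cells,
        # so with this markup index 3 is expected to be the second cell.
        for j, td in enumerate(row.children):
            if j == 3:
                print(td.text.strip())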
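
The loop above relies on the position of a cell among all child nodes of the row, including whitespace, which can shift if the markup changes. A possible alternative, sketched here under assumptions, is to collect the cells with find_all('td') and index the resulting list. This also shows where the original attempt goes wrong: find returns a single Tag (or None), not a list, so [1] on its result does not select the second cell. The sample HTML and the helper name second_column below are hypothetical stand-ins, used so the sketch runs without a network request; they are not taken from the real page.

```python
import bs4

# Hypothetical sample markup standing in for the Wikipedia table.
SAMPLE_HTML = """
<table class="wikitable sortable">
  <tr><th>Postcode area</th><th>Postcode districts</th><th>Post town</th></tr>
  <tr><td>AB</td><td>AB10, AB11</td><td>ABERDEEN</td></tr>
  <tr><td>AL</td><td>AL1, AL2</td><td>ST ALBANS</td></tr>
</table>
"""


def second_column(html):
    """Return the text of the second <td> of each row in the matching table."""
    soup = bs4.BeautifulSoup(html, "html.parser")
    table = soup.find("table", {"class": "wikitable sortable"})
    values = []
    for row in table.find_all("tr"):
        # find_all returns a list of cells; find would return one Tag or None.
        cells = row.find_all("td")
        # Rows with fewer than two <td> cells (such as the header row) are skipped.
        if len(cells) > 1:
            values.append(cells[1].get_text(strip=True))
    return values


print(second_column(SAMPLE_HTML))
```

To use it on the live page, pass requests.get(url).text instead of SAMPLE_HTML. If the real table puts its first column in th cells rather than td, the index would need adjusting.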
