Python lxml: printing every table row

Question

When I enter a value for "cn", the script runs a query against a website and the result is a table with multiple rows.

from lxml import html
from lxml import etree
from lxml.etree import XPath
import requests

cn = input ('CN: ')

find_page = requests.get('search query' + cn + '')
tree = html.fromstring(find_page.content)

# //tr[2]/td[2]/a/text() is first row after <th>
com = tree.xpath('//tr[2]/td[2]/a/text()')

print ('COM:', com)

This code prints only the first row of my table, the one at XPath position //tr[2], but I need to print all the other table rows as well: //tr[3]/td[2]/a/text(), //tr[4]/td[2]/a/text(), //tr[...]/td[2]/a/text()

Edit:

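After solving the problem of getting every item from the table, I get a result such as COM: ['DAP', 'DAPA', 'DAP FOOD'], and each of these items has an href. I can only visit and scrape the first link (DAP), not the other two (DAPA and DAP FOOD).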

from lxml import html
from lxml import etree
from lxml.etree import XPath
import requests

cn = input ('CN: ')

find_page = requests.get('search query' + cn + '')
tree = html.fromstring(find_page.content)
    
# //tr/td[2]/a/text() matches the link text in the second cell of every row
com = tree.xpath('//tr/td[2]/a/text()')

link = tree.xpath('//tr/td[2]/a/@href')[0]
link = str(link)

com_link = ('website' + link)
page = requests.get(com_link)
tree = html.fromstring(page.content)

postal_code = tree.xpath('//span[@itemprop="postalCode"]/text()')[0]

print ('COM:', com)
print ('Postal Code', postal_code)

How can I visit DAP, DAPA and DAP FOOD and get the postal_code from each of them?

Tags: python, python-3.x, lxml

Answer


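Change com = tree.xpath('//tr[2]/td[2]/a/text()') to com = tree.xpath('//tr/td[2]/a/text()') and it will work. The [2] predicate on tr restricts the match to the second tr of its parent; without it, the expression matches the second cell of every row. The header row drops out on its own because it contains th cells rather than td cells.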
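
To see the difference between the two expressions without hitting a real site, here is a minimal sketch run against an inline table. The SAMPLE markup and its hrefs are made up for illustration and only assume the layout described in the question: a header row followed by rows whose second cell holds a link.

```python
from lxml import html

# Hypothetical sample standing in for the search result page.
SAMPLE = """
<table>
  <tr><th>No</th><th>Name</th></tr>
  <tr><td>1</td><td><a href="/c/dap">DAP</a></td></tr>
  <tr><td>2</td><td><a href="/c/dapa">DAPA</a></td></tr>
  <tr><td>3</td><td><a href="/c/dap-food">DAP FOOD</a></td></tr>
</table>
"""

tree = html.fromstring(SAMPLE)

# With the positional predicate: only the second <tr> of the table.
first_only = tree.xpath('//tr[2]/td[2]/a/text()')

# Without it: the second <td> of every <tr>; the <th> row has no <td>.
all_rows = tree.xpath('//tr/td[2]/a/text()')

print('first_only:', first_only)
print('all_rows:', all_rows)
```

If the real page nests several tables, //tr will match rows from all of them, so the expression may need to be narrowed to the table in question.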

from lxml import html
import requests

cn = input ('CN: ')

find_page = requests.get('search query' + cn + '')
tree = html.fromstring(find_page.content)

# //tr/td[2]/a/text() matches the link text in the second cell of every row
com = tree.xpath('//tr/td[2]/a/text()')

print ('COM:', com)
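
The follow-up from the edit (visiting DAP, DAPA and DAP FOOD rather than only the first link) comes down to looping over every href instead of taking index [0]. The following is a rough sketch, not a tested solution for the actual site: BASE_URL is a placeholder for the 'website' prefix in the question, and the helper names company_links, postal_code_from, fetch_tree and postal_codes are invented here.

```python
from urllib.parse import urljoin

import requests
from lxml import html

BASE_URL = 'https://example.com'  # assumption: stands in for 'website'


def company_links(tree):
    """Return (name, href) pairs for the link in the second cell of each row."""
    return [
        (a.text_content().strip(), a.get('href'))
        for a in tree.xpath('//tr/td[2]/a[@href]')
    ]


def postal_code_from(page_tree):
    """Return the first postalCode span text, or None if the page has none."""
    codes = page_tree.xpath('//span[@itemprop="postalCode"]/text()')
    return codes[0].strip() if codes else None


def fetch_tree(url):
    """Download a page and parse it with lxml."""
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    return html.fromstring(response.content)


def postal_codes(search_tree, fetch=fetch_tree, base_url=BASE_URL):
    """Visit every company link and map company name to postal code."""
    result = {}
    for name, href in company_links(search_tree):
        page_tree = fetch(urljoin(base_url, href))
        result[name] = postal_code_from(page_tree)
    return result
```

With the tree from the script above, postal_codes(tree) would return a dict keyed by company name. Returning None for a missing span avoids the IndexError that [0] raises on an empty result, and the fetch parameter lets the loop be exercised without network access.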
