lxml returns an empty list, but requests_html returns the desired result

Problem description

I wrote two scripts: one uses requests with lxml, the other uses requests_html. The requests and lxml version returns an empty list, while the requests_html version returns the desired result.

Code 1:

import requests
from lxml import html

url = 'https://www.amazon.com/OnePlus-Glacial-Unlocked-Android-Smartphone/dp/B08723759H/ref=sr_1_1?dchild=1&keywords=oneplus&qid=1624692877&sr=8-1'

# Fetch the page with plain requests (no JavaScript is executed)
page = requests.get(url)

# Parse the returned HTML and look up the product title element
content = html.fromstring(page.text)
title = content.xpath('//*[@id="productTitle"]')
print(title)

# output is always
# []
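An empty list from `xpath` means that no element with that id exists in the HTML that was parsed. It does not necessarily mean the XPath expression is wrong. A minimal sketch for checking this offline is shown here. The helper name `find_title` and the two sample HTML strings are hypothetical stand-ins, not the real Amazon responses.

```python
from lxml import html


def find_title(page_text):
    """Return the stripped text of the element with id="productTitle", or None."""
    tree = html.fromstring(page_text)
    matches = tree.xpath('//*[@id="productTitle"]')
    if not matches:
        # The element is simply not present in the HTML that was parsed.
        return None
    return matches[0].text_content().strip()


# Hypothetical sample pages (assumptions for illustration only).
with_title = '<html><body><span id="productTitle"> OnePlus 8 </span></body></html>'
without_title = '<html><body><p>Enter the characters you see below</p></body></html>'

print(find_title(with_title))     # OnePlus 8
print(find_title(without_title))  # None
```

If the same helper returns `None` for `page.text`, printing `page.status_code` and the first few hundred characters of `page.text` should show what the server actually sent.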

Code 2:

from requests_html import HTMLSession

url = 'https://www.amazon.com/OnePlus-Glacial-Unlocked-Android-Smartphone/dp/B08723759H/ref=sr_1_1?dchild=1&keywords=oneplus&qid=1624692877&sr=8-1'

session = HTMLSession()
r = session.get(url)

# Render the page in a headless browser so that JavaScript runs
r.html.render(sleep=1)

# first=True returns a single element instead of a list
title = r.html.xpath('//*[@id="productTitle"]', first=True)
print(title.text)

# output
# OnePlus 8 Glacial Green, 5G Unlocked Android Smartphone U.S Version, 8GB RAM+128GB Storage, 90Hz Fluid Display,Triple Camera, with Alexa Built-in,

Tags: python

Solution
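One possible explanation, stated as an assumption and not verified against this page: Amazon may send a different response (for example a bot-check page) to a client that uses the default requests User-Agent, so the element with id `productTitle` is missing from `page.text`. A minimal sketch that sends browser-like headers with plain requests and lxml follows. The header values, the `fetch_title` helper, and its `get` parameter are hypothetical choices for illustration, and the site may still block the request.

```python
import requests
from lxml import html

# Assumed browser-like headers; the exact values are illustrative only.
BROWSER_HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/91.0 Safari/537.36"
    ),
    "Accept-Language": "en-US,en;q=0.9",
}


def fetch_title(url, get=requests.get):
    """Fetch url with browser-like headers and return the product title text, or None.

    The get parameter (hypothetical) allows swapping in another callable,
    such as a session's get method or a stub for offline testing.
    """
    response = get(url, headers=BROWSER_HEADERS, timeout=10)
    tree = html.fromstring(response.text)
    matches = tree.xpath('//*[@id="productTitle"]')
    return matches[0].text_content().strip() if matches else None
```

Calling `fetch_title(url)` with the URL from the question would then return either the title text or `None`. If it still returns `None`, the title is probably absent from the raw HTML, and a rendering approach such as the requests_html code above remains the fallback.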

