requests module works, but FormRequest doesn't

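Question

I'm trying to learn Scrapy. I tried to replicate the following POST request in Scrapy, but without success. I also tried scrapy.Request(method='POST'), but that didn't work either.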

import requests, json

headers = {
    'accept': '*/*',
    'accept-encoding': 'gzip, deflate, br',
    'accept-language': 'en-US,en;q=0.9',
    'content-length': '132',
    'content-type': 'application/x-www-form-urlencoded',
    'origin': 'https://www.autozone.com',
    'referer': 'https://www.autozone.com/miscellaneous-non-automotive/jump-starter',
    'sec-fetch-mode': 'cors',
    'sec-fetch-site': 'same-origin',
    'user-agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.130 Safari/537.36',
    'x-requested-with': 'XMLHttpRequest'
}

url = 'https://www.autozone.com/rest/bean/autozone/diy/commerce/pricing/PricingServices/retrievePriceAndAvailability?atg-rest-depth=2'

data = {
    'arg1': '9801',
    'arg2': '',
    'arg3': '824997',
    'arg4': ''
}

response = requests.post(url, headers=headers, data=data, timeout=5)

info = json.loads(response.text)
print(info['atgResponse'][0]['retailPrice']) # prints 129.99

Scrapy shell:

> r = scrapy.FormRequest(url, formdata=data, headers=headers)
> fetch(r) # Doesn't work

Can anyone point out where I'm going wrong?

Edit 1:

Here is the output from Scrapy. Hopefully this helps.

>>> fetch(r)
2020-02-15 15:00:08 [scrapy.core.engine] INFO: Spider opened
2020-02-15 15:03:08 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <POST https://www.autozone.com/rest/bean/autozone/diy/commerce/pricing/PricingServices/retrievePriceAndAvailability?atg-rest-depth=2> (failed 1 times): User timeout caused connection failure: Getting https://www.autozone.com/rest/bean/autozone/diy/commerce/pricing/PricingServices/retrievePriceAndAvailability?atg-rest-depth=2 took longer than 180.0 seconds..

It retries a few times and then fails.

Thanks.

Tags: python, scrapy, python-requests

Answer


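I tried accessing your link, but it returned this error: "Access to the requested resource is not allowed: /autozone/diy/commerce/pricing/PricingServices". So I suspect your request needs an Authorization header or a session cookie, which you neither provided nor left a placeholder for. Their absence may be what causes the timeout.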
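If that suspicion is right, one thing to try is passing the missing credentials to FormRequest explicitly. Below is a minimal sketch, not a confirmed fix: the helper names build_request_kwargs and make_form_request, the cookie name PLACEHOLDER_SESSION, and the token value are all hypothetical, and you would need to copy the real cookie or Authorization value from your browser session. The sketch also leaves out the hard-coded content-length header so that the client computes it from the actual body; whether that header plays any role here is an assumption on my part.

```python
URL = ('https://www.autozone.com/rest/bean/autozone/diy/commerce/pricing/'
       'PricingServices/retrievePriceAndAvailability?atg-rest-depth=2')

FORM_DATA = {'arg1': '9801', 'arg2': '', 'arg3': '824997', 'arg4': ''}

# Subset of the headers from the question. 'content-length' is deliberately
# omitted so it is derived from the real body (assumption: the fixed value
# '132' may not match; these four fields encode to 33 bytes).
BASE_HEADERS = {
    'accept': '*/*',
    'content-type': 'application/x-www-form-urlencoded',
    'origin': 'https://www.autozone.com',
    'referer': 'https://www.autozone.com/miscellaneous-non-automotive/jump-starter',
    'x-requested-with': 'XMLHttpRequest',
}


def build_request_kwargs(session_cookies=None, auth_token=None):
    """Assemble keyword arguments for scrapy.FormRequest (hypothetical helper).

    session_cookies: dict of cookie name -> value copied from a browser session.
    auth_token: full value for the Authorization header, if the site uses one.
    """
    headers = dict(BASE_HEADERS)
    if auth_token:
        headers['authorization'] = auth_token
    kwargs = {'url': URL, 'formdata': dict(FORM_DATA), 'headers': headers}
    if session_cookies:
        kwargs['cookies'] = dict(session_cookies)
    return kwargs


def make_form_request(session_cookies=None, auth_token=None):
    """Build the FormRequest; Scrapy is imported here so the helper above
    can be inspected without Scrapy installed."""
    import scrapy
    return scrapy.FormRequest(**build_request_kwargs(session_cookies, auth_token))
```

In the Scrapy shell you could then run something like r = make_form_request({'PLACEHOLDER_SESSION': '<value from your browser>'}) followed by fetch(r), and compare the result with what the requests version returns.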

