Scrapy shell: request and response URLs do not match?

Problem description

This is the command I ran in the PyCharm terminal:

scrapy shell "https://www.puppis.com.ar/perros/alimentos/alimentos-secos#2"

Output:

Available Scrapy objects:
[s]   scrapy     scrapy module (contains scrapy.Request, scrapy.Selector, etc)
[s]   crawler    <scrapy.crawler.Crawler object at 0x000001F4B3F77AF0>
[s]   item       {}
[s]   request    <GET https://www.puppis.com.ar/perros/alimentos/alimentos-secos#2>
[s]   response   <200 https://www.puppis.com.ar/perros/alimentos/alimentos-secos>
[s]   settings   <scrapy.settings.Settings object at 0x000001F4B3F77160>
[s]   spider     <DefaultSpider 'default' at 0x1f4b4442d30>
[s] Useful shortcuts:
[s]   fetch(url[, redirect=True]) Fetch URL and update local objects (by default, redirects are followed)
[s]   fetch(req)                  Fetch a scrapy.Request and update local objects
[s]   shelp()           Shell help (print this help)
[s]   view(response)    View response in a browser

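Why does the response URL in the output differ from the request URL?

As I understand it, the page with "#2" is being redirected to the main page, which has no "#2" at the end. Is there any way to prevent this?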

Tags: python, scrapy

Solution
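
This is most likely not a redirect. The part of a URL after "#" is a fragment, which HTTP clients keep locally and do not send to the server, so the server only ever sees the URL without "#2", and the response reports that URL. The sketch below uses only the Python standard library (not Scrapy itself) to show the split; the variable names are illustrative.

```python
from urllib.parse import urldefrag

# The URL passed to `scrapy shell` in the question.
requested = "https://www.puppis.com.ar/perros/alimentos/alimentos-secos#2"

# urldefrag separates the fragment from the rest of the URL.
# Only `url_sent` is transmitted in the HTTP request; the fragment
# stays on the client side.
url_sent, fragment = urldefrag(requested)

print(url_sent)
print(fragment)
```

If "#2" selects the second page of products, that selection is probably handled by JavaScript in the browser (an assumption about this site, not something visible in the output above). In that case the page data would have to be fetched from whatever request the browser makes for it, which can be looked up in the browser's network tools.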

