Why does the output only contain data from the last URL with Scrapy?

Problem description

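Hi everyone, I have this problem: I want to scrape multiple URLs on the same domain and save the results to a JSON file, but the output contains only the last URL's result, repeated n times.

Maybe an example will help me explain.

Updated with the real code. This is my code: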

import scrapy

class Test(scrapy.Spider):
    name= "testscraper"
    allowed_domains=['ebird.org']
    start_urls=[
            'https://ebird.org/species/ostric2',
            'https://ebird.org/species/ostric3',
            'https://ebird.org/species/y00934', 
            'https://ebird.org/species/grerhe1', 
            'https://ebird.org/species/lesrhe2'
    ]
    def start_requests(self):
        for url in self.start_urls:
            print('---------------------------')
            print(url)
            print('---------------------------')
            yield scrapy.Request(url=url,callback=self.parse,dont_filter=True)
 

    def parse(self,response):
        print('***************************')
        print(response.url)
        print('***************************')
        image = response.css('img').xpath('@src').get()
        # Take the last path segment; species codes vary in length (e.g. y00934).
        code = response.url.rstrip('/').rsplit('/', 1)[-1]
        common_name=response.xpath('//span[@class="Heading-main Media--hero-title"]//text()').get()
        scientific_name=response.xpath('//span[@class="Heading-sub Heading-sub--sci Heading-sub--custom u-text-4-loose"]//text()').get()
        description=response.xpath('//p[@class="u-stack-sm"]/text()').get()
        if description:
            description=description.split('\n',1)[0]

        yield {
            'code':code,
            'scientific_name':scientific_name,
            'common_name':common_name,
            'description':description,
            'image':image,
            'url':response.url
        }

So when I run:

scrapy crawl testscraper -O testscraper.json

the file testscraper.json contains:

[
{"code": "lesrhe2", "scientific_name": "Rhea pennata", "common_name": "Lesser Rhea", "description": "This flightless South American relative of the Ostrich stands about 5 feet tall with a body about the size of a sheep; no similar species in its range. Rheas roam widely on open Patagonian steppe and also occur locally in open habitats of the Andes, mainly at very high elevations. Can be confiding where used to people, but in other areas wary, running strongly and quickly. Rheas occur singly or in groups, and males take care of the young. Adults have bold pale spots on the body, first-year birds are plainer overall.", "image": "https://cdn.download.ams.birds.cornell.edu/api/v1/asset/115691341/1800", "audio": "assetId\":524686", "url": "https://ebird.org/species/lesrhe2"},
{"code": "lesrhe2", "scientific_name": "Rhea pennata", "common_name": "Lesser Rhea", "description": "This flightless South American relative of the Ostrich stands about 5 feet tall with a body about the size of a sheep; no similar species in its range. Rheas roam widely on open Patagonian steppe and also occur locally in open habitats of the Andes, mainly at very high elevations. Can be confiding where used to people, but in other areas wary, running strongly and quickly. Rheas occur singly or in groups, and males take care of the young. Adults have bold pale spots on the body, first-year birds are plainer overall.", "image": "https://cdn.download.ams.birds.cornell.edu/api/v1/asset/115691341/1800", "audio": "assetId\":524686", "url": "https://ebird.org/species/lesrhe2"},
{"code": "lesrhe2", "scientific_name": "Rhea pennata", "common_name": "Lesser Rhea", "description": "This flightless South American relative of the Ostrich stands about 5 feet tall with a body about the size of a sheep; no similar species in its range. Rheas roam widely on open Patagonian steppe and also occur locally in open habitats of the Andes, mainly at very high elevations. Can be confiding where used to people, but in other areas wary, running strongly and quickly. Rheas occur singly or in groups, and males take care of the young. Adults have bold pale spots on the body, first-year birds are plainer overall.", "image": "https://cdn.download.ams.birds.cornell.edu/api/v1/asset/115691341/1800", "audio": "assetId\":524686", "url": "https://ebird.org/species/lesrhe2"},
{"code": "lesrhe2", "scientific_name": "Rhea pennata", "common_name": "Lesser Rhea", "description": "This flightless South American relative of the Ostrich stands about 5 feet tall with a body about the size of a sheep; no similar species in its range. Rheas roam widely on open Patagonian steppe and also occur locally in open habitats of the Andes, mainly at very high elevations. Can be confiding where used to people, but in other areas wary, running strongly and quickly. Rheas occur singly or in groups, and males take care of the young. Adults have bold pale spots on the body, first-year birds are plainer overall.", "image": "https://cdn.download.ams.birds.cornell.edu/api/v1/asset/115691341/1800", "audio": "assetId\":524686", "url": "https://ebird.org/species/lesrhe2"},
{"code": "lesrhe2", "scientific_name": "Rhea pennata", "common_name": "Lesser Rhea", "description": "This flightless South American relative of the Ostrich stands about 5 feet tall with a body about the size of a sheep; no similar species in its range. Rheas roam widely on open Patagonian steppe and also occur locally in open habitats of the Andes, mainly at very high elevations. Can be confiding where used to people, but in other areas wary, running strongly and quickly. Rheas occur singly or in groups, and males take care of the young. Adults have bold pale spots on the body, first-year birds are plainer overall.", "image": "https://cdn.download.ams.birds.cornell.edu/api/v1/asset/115691341/1800", "audio": "assetId\":524686", "url": "https://ebird.org/species/lesrhe2"}
]

That is the last dict, but 5 times, once per URL.

I asked for help, and someone suggested using the following setting:

DUPEFILTER_CLASS = 'scrapy.dupefilters.BaseDupeFilter'

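But it still doesn't work.

I don't usually ask for help, but I really don't understand what is happening. Maybe it's something silly, but I just don't see it. If you know what's going on, please give me a hint.

Actual settings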

DOWNLOAD_DELAY = 30
DUPEFILTER_CLASS = 'scrapy.dupefilters.BaseDupeFilter'
CONCURRENT_REQUESTS_PER_DOMAIN=1

And the log:

2021-08-17 00:32:16 [scrapy.utils.log] INFO: Scrapy 2.5.0 started (bot: ebird)
2021-08-17 00:32:16 [scrapy.utils.log] INFO: Versions: lxml 4.6.3.0, libxml2 2.9.10, cssselect 1.1.0, parsel 1.6.0, w3lib 1.22.0, Twisted 21.7.0, Python 3.8.5 (default, Sep  4 2020, 07:30:14) - [GCC 7.3.0], pyOpenSSL 20.0.1 (OpenSSL 1.1.1k  25 Mar 2021), cryptography 3.4.7, Platform Linux-5.4.0-80-generic-x86_64-with-glibc2.10
2021-08-17 00:32:16 [scrapy.utils.log] DEBUG: Using reactor: twisted.internet.epollreactor.EPollReactor
2021-08-17 00:32:16 [scrapy.crawler] INFO: Overridden settings:
{'BOT_NAME': 'ebird',
 'CONCURRENT_REQUESTS_PER_DOMAIN': 1,
 'DOWNLOAD_DELAY': 10,
 'DUPEFILTER_CLASS': 'scrapy.dupefilters.BaseDupeFilter',
 'NEWSPIDER_MODULE': 'ebird.spiders',
 'ROBOTSTXT_OBEY': True,
 'SPIDER_MODULES': ['ebird.spiders']}
2021-08-17 00:32:16 [scrapy.extensions.telnet] INFO: Telnet Password: 00b83b5e4e0bedd7
2021-08-17 00:32:16 [scrapy.middleware] INFO: Enabled extensions:
['scrapy.extensions.corestats.CoreStats',
 'scrapy.extensions.telnet.TelnetConsole',
 'scrapy.extensions.memusage.MemoryUsage',
 'scrapy.extensions.feedexport.FeedExporter',
 'scrapy.extensions.logstats.LogStats']
2021-08-17 00:32:16 [scrapy.middleware] INFO: Enabled downloader middlewares:
['scrapy.downloadermiddlewares.robotstxt.RobotsTxtMiddleware',
 'scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
 'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
 'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware',
 'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware',
 'scrapy.downloadermiddlewares.retry.RetryMiddleware',
 'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware',
 'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware',
 'scrapy.downloadermiddlewares.redirect.RedirectMiddleware',
 'scrapy.downloadermiddlewares.cookies.CookiesMiddleware',
 'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware',
 'scrapy.downloadermiddlewares.stats.DownloaderStats']
2021-08-17 00:32:16 [scrapy.middleware] INFO: Enabled spider middlewares:
['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
 'scrapy.spidermiddlewares.offsite.OffsiteMiddleware',
 'scrapy.spidermiddlewares.referer.RefererMiddleware',
 'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
 'scrapy.spidermiddlewares.depth.DepthMiddleware']
2021-08-17 00:32:16 [scrapy.middleware] INFO: Enabled item pipelines:
[]
2021-08-17 00:32:16 [scrapy.core.engine] INFO: Spider opened
2021-08-17 00:32:16 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2021-08-17 00:32:16 [scrapy.extensions.telnet] INFO: Telnet console listening on 127.0.0.1:6023
---------------------------
https://ebird.org/species/ostric2
---------------------------
---------------------------
https://ebird.org/species/ostric3
---------------------------
---------------------------
https://ebird.org/species/y00934
---------------------------
---------------------------
https://ebird.org/species/grerhe1
---------------------------
---------------------------
https://ebird.org/species/lesrhe2
---------------------------
2021-08-17 00:32:17 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://ebird.org/robots.txt> (referer: None)
2021-08-17 00:32:29 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://secure.birds.cornell.edu/cassso/login?service=https%3A%2F%2Febird.org%2Flogin%2Fcas%3Fportal%3Debird&gateway=true&locale=en> from <GET https://ebird.org/species/ostric2>
2021-08-17 00:32:29 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://secure.birds.cornell.edu/robots.txt> (referer: None)
2021-08-17 00:32:40 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://secure.birds.cornell.edu/cassso/login?service=https%3A%2F%2Febird.org%2Flogin%2Fcas%3Fportal%3Debird&gateway=true&locale=en> from <GET https://ebird.org/species/ostric3>
2021-08-17 00:32:57 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://secure.birds.cornell.edu/cassso/login?service=https%3A%2F%2Febird.org%2Flogin%2Fcas%3Fportal%3Debird&gateway=true&locale=en> from <GET https://ebird.org/species/y00934>
2021-08-17 00:33:01 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://secure.birds.cornell.edu/cassso/login?service=https%3A%2F%2Febird.org%2Flogin%2Fcas%3Fportal%3Debird&gateway=true&locale=en> from <GET https://ebird.org/species/grerhe1>
2021-08-17 00:33:15 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://secure.birds.cornell.edu/cassso/login?service=https%3A%2F%2Febird.org%2Flogin%2Fcas%3Fportal%3Debird&gateway=true&locale=en> from <GET https://ebird.org/species/lesrhe2>
2021-08-17 00:33:16 [scrapy.extensions.logstats] INFO: Crawled 2 pages (at 2 pages/min), scraped 0 items (at 0 items/min)
2021-08-17 00:33:28 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://ebird.org/login/cas?portal=ebird> from <GET https://secure.birds.cornell.edu/cassso/login?service=https%3A%2F%2Febird.org%2Flogin%2Fcas%3Fportal%3Debird&gateway=true&locale=en>
2021-08-17 00:33:41 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://ebird.org/login/cas?portal=ebird> from <GET https://secure.birds.cornell.edu/cassso/login?service=https%3A%2F%2Febird.org%2Flogin%2Fcas%3Fportal%3Debird&gateway=true&locale=en>
2021-08-17 00:33:54 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://ebird.org/login/cas?portal=ebird> from <GET https://secure.birds.cornell.edu/cassso/login?service=https%3A%2F%2Febird.org%2Flogin%2Fcas%3Fportal%3Debird&gateway=true&locale=en>
2021-08-17 00:34:05 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://ebird.org/login/cas?portal=ebird> from <GET https://secure.birds.cornell.edu/cassso/login?service=https%3A%2F%2Febird.org%2Flogin%2Fcas%3Fportal%3Debird&gateway=true&locale=en>
2021-08-17 00:34:16 [scrapy.extensions.logstats] INFO: Crawled 2 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2021-08-17 00:34:19 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://ebird.org/login/cas?portal=ebird> from <GET https://secure.birds.cornell.edu/cassso/login?service=https%3A%2F%2Febird.org%2Flogin%2Fcas%3Fportal%3Debird&gateway=true&locale=en>
2021-08-17 00:34:32 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://ebird.org/ebird/species/lesrhe2> from <GET https://ebird.org/login/cas?portal=ebird>
2021-08-17 00:34:42 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://ebird.org/ebird/species/lesrhe2> from <GET https://ebird.org/login/cas?portal=ebird>
2021-08-17 00:34:53 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://ebird.org/ebird/species/lesrhe2> from <GET https://ebird.org/login/cas?portal=ebird>
2021-08-17 00:35:14 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://ebird.org/ebird/species/lesrhe2> from <GET https://ebird.org/login/cas?portal=ebird>
2021-08-17 00:35:15 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://ebird.org/ebird/species/lesrhe2> from <GET https://ebird.org/login/cas?portal=ebird>
2021-08-17 00:35:16 [scrapy.extensions.logstats] INFO: Crawled 2 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2021-08-17 00:35:21 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://ebird.org/species/lesrhe2> from <GET https://ebird.org/ebird/species/lesrhe2>
2021-08-17 00:35:34 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://ebird.org/species/lesrhe2> from <GET https://ebird.org/ebird/species/lesrhe2>
2021-08-17 00:35:51 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://ebird.org/species/lesrhe2> from <GET https://ebird.org/ebird/species/lesrhe2>
2021-08-17 00:36:03 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://ebird.org/species/lesrhe2> from <GET https://ebird.org/ebird/species/lesrhe2>
2021-08-17 00:36:16 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://ebird.org/species/lesrhe2> from <GET https://ebird.org/ebird/species/lesrhe2>
2021-08-17 00:36:16 [scrapy.extensions.logstats] INFO: Crawled 2 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2021-08-17 00:36:28 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://ebird.org/species/lesrhe2> (referer: None)
***************************
https://ebird.org/species/lesrhe2
***************************
2021-08-17 00:36:28 [scrapy.core.scraper] DEBUG: Scraped from <200 https://ebird.org/species/lesrhe2>
{'code': 'lesrhe2', 'scientific_name': 'Rhea pennata', 'common_name': 'Lesser Rhea', 'description': 'This flightless South American relative of the Ostrich stands about 5 feet tall with a body about the size of a sheep; no similar species in its range. Rheas roam widely on open Patagonian steppe and also occur locally in open habitats of the Andes, mainly at very high elevations. Can be confiding where used to people, but in other areas wary, running strongly and quickly. Rheas occur singly or in groups, and males take care of the young. Adults have bold pale spots on the body, first-year birds are plainer overall.', 'image': 'https://cdn.download.ams.birds.cornell.edu/api/v1/asset/115691341/1800', 'audio': 'assetId":524686', 'url': 'https://ebird.org/species/lesrhe2'}
2021-08-17 00:36:42 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://ebird.org/species/lesrhe2> (referer: None)
***************************
https://ebird.org/species/lesrhe2
***************************
2021-08-17 00:36:42 [scrapy.core.scraper] DEBUG: Scraped from <200 https://ebird.org/species/lesrhe2>
{'code': 'lesrhe2', 'scientific_name': 'Rhea pennata', 'common_name': 'Lesser Rhea', 'description': 'This flightless South American relative of the Ostrich stands about 5 feet tall with a body about the size of a sheep; no similar species in its range. Rheas roam widely on open Patagonian steppe and also occur locally in open habitats of the Andes, mainly at very high elevations. Can be confiding where used to people, but in other areas wary, running strongly and quickly. Rheas occur singly or in groups, and males take care of the young. Adults have bold pale spots on the body, first-year birds are plainer overall.', 'image': 'https://cdn.download.ams.birds.cornell.edu/api/v1/asset/115691341/1800', 'audio': 'assetId":524686', 'url': 'https://ebird.org/species/lesrhe2'}
2021-08-17 00:36:54 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://ebird.org/species/lesrhe2> (referer: None)
***************************
https://ebird.org/species/lesrhe2
***************************
2021-08-17 00:36:55 [scrapy.core.scraper] DEBUG: Scraped from <200 https://ebird.org/species/lesrhe2>
{'code': 'lesrhe2', 'scientific_name': 'Rhea pennata', 'common_name': 'Lesser Rhea', 'description': 'This flightless South American relative of the Ostrich stands about 5 feet tall with a body about the size of a sheep; no similar species in its range. Rheas roam widely on open Patagonian steppe and also occur locally in open habitats of the Andes, mainly at very high elevations. Can be confiding where used to people, but in other areas wary, running strongly and quickly. Rheas occur singly or in groups, and males take care of the young. Adults have bold pale spots on the body, first-year birds are plainer overall.', 'image': 'https://cdn.download.ams.birds.cornell.edu/api/v1/asset/115691341/1800', 'audio': 'assetId":524686', 'url': 'https://ebird.org/species/lesrhe2'}
2021-08-17 00:37:16 [scrapy.extensions.logstats] INFO: Crawled 5 pages (at 3 pages/min), scraped 3 items (at 3 items/min)
2021-08-17 00:37:26 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://ebird.org/species/lesrhe2> (referer: None)
***************************
https://ebird.org/species/lesrhe2
***************************
2021-08-17 00:37:26 [scrapy.core.scraper] DEBUG: Scraped from <200 https://ebird.org/species/lesrhe2>
{'code': 'lesrhe2', 'scientific_name': 'Rhea pennata', 'common_name': 'Lesser Rhea', 'description': 'This flightless South American relative of the Ostrich stands about 5 feet tall with a body about the size of a sheep; no similar species in its range. Rheas roam widely on open Patagonian steppe and also occur locally in open habitats of the Andes, mainly at very high elevations. Can be confiding where used to people, but in other areas wary, running strongly and quickly. Rheas occur singly or in groups, and males take care of the young. Adults have bold pale spots on the body, first-year birds are plainer overall.', 'image': 'https://cdn.download.ams.birds.cornell.edu/api/v1/asset/115691341/1800', 'audio': 'assetId":524686', 'url': 'https://ebird.org/species/lesrhe2'}
2021-08-17 00:37:32 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://ebird.org/species/lesrhe2> (referer: None)
***************************
https://ebird.org/species/lesrhe2
***************************
2021-08-17 00:37:32 [scrapy.core.scraper] DEBUG: Scraped from <200 https://ebird.org/species/lesrhe2>
{'code': 'lesrhe2', 'scientific_name': 'Rhea pennata', 'common_name': 'Lesser Rhea', 'description': 'This flightless South American relative of the Ostrich stands about 5 feet tall with a body about the size of a sheep; no similar species in its range. Rheas roam widely on open Patagonian steppe and also occur locally in open habitats of the Andes, mainly at very high elevations. Can be confiding where used to people, but in other areas wary, running strongly and quickly. Rheas occur singly or in groups, and males take care of the young. Adults have bold pale spots on the body, first-year birds are plainer overall.', 'image': 'https://cdn.download.ams.birds.cornell.edu/api/v1/asset/115691341/1800', 'audio': 'assetId":524686', 'url': 'https://ebird.org/species/lesrhe2'}
2021-08-17 00:37:32 [scrapy.core.engine] INFO: Closing spider (finished)
2021-08-17 00:37:32 [scrapy.extensions.feedexport] INFO: Stored json feed (5 items) in: birdscraper.json
2021-08-17 00:37:32 [scrapy.statscollectors] INFO: Dumping Scrapy stats:
{'downloader/request_bytes': 7672,
 'downloader/request_count': 27,
 'downloader/request_method_count/GET': 27,
 'downloader/response_bytes': 373670,
 'downloader/response_count': 27,
 'downloader/response_status_count/200': 7,
 'downloader/response_status_count/302': 20,
 'elapsed_time_seconds': 315.86914,
 'feedexport/success_count/FileFeedStorage': 1,
 'finish_reason': 'finished',
 'finish_time': datetime.datetime(2021, 8, 17, 4, 37, 32, 599873),
 'httpcompression/response_bytes': 1367567,
 'httpcompression/response_count': 6,
 'item_scraped_count': 5,
 'log_count/DEBUG': 32,
 'log_count/INFO': 16,
 'memusage/max': 66494464,
 'memusage/startup': 56147968,
 'response_received_count': 7,
 'robotstxt/request_count': 2,
 'robotstxt/response_count': 2,
 'robotstxt/response_status_count/200': 2,
 'scheduler/dequeued': 25,
 'scheduler/dequeued/memory': 25,
 'scheduler/enqueued': 25,
 'scheduler/enqueued/memory': 25,
 'start_time': datetime.datetime(2021, 8, 17, 4, 32, 16, 730733)}
2021-08-17 00:37:32 [scrapy.core.engine] INFO: Spider closed (finished)

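I noticed the redirects, so I tried to change that with "dont_redirect". The script then printed the correct URLs but raised errors, because the spider never reached the pages and so could not extract any fields.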

Tags: python, scrapy

Solution


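Requesting the domain you are interested in (ebird.org) with curl and following the redirects makes it clear that cookies are needed for the redirects to eventually resolve correctly. However, reusing the same cookie session (Scrapy's default behavior) seems to cause the odd redirect behavior you are seeing.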
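The redirect pattern can be checked directly from a Scrapy log. The following is only a rough sketch using the standard library: the helper names `redirect_hops` and `targets_from` are hypothetical, and the two sample lines are copied from the log in the question. It lists where redirects from the login callback end up; in the log above, they all point at the last species page.

```python
import re

# Pattern for the RedirectMiddleware DEBUG lines shown in the log above.
REDIRECT_RE = re.compile(r"Redirecting \(\d{3}\) to <GET (\S+)> from <GET (\S+)>")


def redirect_hops(log_text):
    """Return (source_url, target_url) pairs for every redirect line."""
    return [(m.group(2), m.group(1)) for m in REDIRECT_RE.finditer(log_text)]


def targets_from(hops, source_prefix):
    """Distinct redirect targets for hops whose source starts with a prefix."""
    return sorted({dst for src, dst in hops if src.startswith(source_prefix)})


# Two lines copied from the log in the question.
sample_log = (
    "2021-08-17 00:34:32 [scrapy.downloadermiddlewares.redirect] DEBUG: "
    "Redirecting (302) to <GET https://ebird.org/ebird/species/lesrhe2> "
    "from <GET https://ebird.org/login/cas?portal=ebird>\n"
    "2021-08-17 00:35:21 [scrapy.downloadermiddlewares.redirect] DEBUG: "
    "Redirecting (302) to <GET https://ebird.org/species/lesrhe2> "
    "from <GET https://ebird.org/ebird/species/lesrhe2>\n"
)

hops = redirect_hops(sample_log)
print(targets_from(hops, "https://ebird.org/login/cas"))
```

Run against the full log, a single distinct target for all five login callbacks would be consistent with the shared cookie session remembering only one destination page, although the log alone does not prove how the server stores it.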

The fix is to use a different cookie session for each Request:

    ...

    def start_requests(self):
        for i, url in enumerate(self.start_urls):
            print("---------------------------")
            print(url)
            print("---------------------------")
            yield scrapy.Request(
                url=url, callback=self.parse, dont_filter=True, meta={"cookiejar": i}
            )

    ...

Note that if you later add follow-up requests from each start request, you need to explicitly re-attach the cookie jar each time: meta={'cookiejar': response.meta['cookiejar']}

See also this answer.

