Scrapy: 403 error persists even after adding headers

Problem description

I am trying to scrape doordash.com, but every time I run the request it returns a 403 along with this log line: INFO: Ignoring response <403 http://doordash.com/>: HTTP status code is not handled or not allowed

I have tried many things, such as adding a User-Agent, but nothing helped. I also added a full set of browser headers, with the same result. Here is my code:

import scrapy


class DoordashSpider(scrapy.Spider):
    name = 'doordash'
    allowed_domains = ['doordash.com']
    start_urls = ['http://doordash.com/']

    def start_requests(self):
        # Browser-like headers sent with every start request.
        headers = {
            'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.101 Safari/537.36',
            'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
            'Accept-Language': 'en-US,en;q=0.9',
            'Accept-Encoding': 'gzip, deflate, br',
        }
        for url in self.start_urls:
            yield scrapy.Request(url, headers=headers)

    def parse(self, response):
        # Only reached for responses Scrapy does not filter out (2xx by default).
        print('Crawled Successfully')

How can I get a 200 response?

Tags: web-scraping, scrapy

Solution
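The log line in the question comes from Scrapy's HttpErrorMiddleware, which drops non-2xx responses before they reach the spider. The sketch below is a diagnostic starting point, not a confirmed fix: it assumes a standard project `settings.py` and lets 403 responses through so the body of the block page can be inspected. It will not turn the 403 into a 200 by itself. If the site rejects requests for reasons other than headers (an assumption, since the question does not show the response body), header changes alone may never be enough.

```python
# settings.py (sketch; values are illustrative assumptions)

# Pass 403 responses to spider callbacks instead of letting
# HttpErrorMiddleware discard them, so the block page can be examined.
HTTPERROR_ALLOWED_CODES = [403]

# User agent applied to every request by UserAgentMiddleware.
USER_AGENT = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/91.0.4472.101 Safari/537.36"
)

# Headers applied to every request unless a Request overrides them.
DEFAULT_REQUEST_HEADERS = {
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language": "en-US,en;q=0.9",
}
```

With these settings, `parse` is called for the 403 response too, so logging `response.status` and the first few hundred characters of `response.text` there can show what the server actually returned.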

