Where does "ERROR: Spider error processing" come from?

Problem description

I am reading a log from a previous run of my spider. I would like to know where this exception comes from and how I can act on it:

2019-04-12 22:00:55 [scrapy.core.scraper] ERROR: Spider error processing <GET https://www.website.com/next_page> (referer: https://www.website.com/prev_page)
Traceback (most recent call last):...

I looked at middlewares.py, settings.py and the other files in my project, and I cannot find any line that calls logging.error or spider.logger.error. Even in the built-in methods def process_spider_exception(self, response, exception, spider): and def process_exception(self, request, exception, spider): I cannot find any line that emits a log message. The documentation did not make it clear to me either.

Now, about acting on it. I want to know where the message comes from because I would like to add some code that writes the URLs to a dedicated file whenever certain kinds of exception raise a spider processing error. I could then analyze those URLs, fix the problem, and launch the spider again on just the URLs in that file, which is more comfortable than working from a Scrapy log file.

Beyond wanting to act on it, I would like to know where this code lives and how it works.


To answer your question, that log message comes from the handle_spider_error method in the Scrapy package, defined in core/scraper.py.
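
One way to confirm this from the log line itself: the bracketed [scrapy.core.scraper] is the logger name, and Scrapy modules name their loggers after the module path. The sketch below uses only the standard library and assumes the project keeps Scrapy's default LOG_FORMAT and LOG_DATEFORMAT; the helper render_like_scrapy is a hypothetical name made up for this illustration, not part of Scrapy.

```python
import io
import logging

# Scrapy's default log format: the logger name goes inside the brackets.
LOG_FORMAT = "%(asctime)s [%(name)s] %(levelname)s: %(message)s"
LOG_DATEFORMAT = "%Y-%m-%d %H:%M:%S"


def render_like_scrapy(logger_name: str, message: str) -> str:
    """Emit one ERROR record on the named logger and return the formatted line.

    Hypothetical helper for illustration only.
    """
    stream = io.StringIO()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter(LOG_FORMAT, LOG_DATEFORMAT))

    logger = logging.getLogger(logger_name)
    logger.addHandler(handler)
    try:
        logger.error(message)
    finally:
        # Leave the logger as we found it.
        logger.removeHandler(handler)
    return stream.getvalue().strip()


line = render_like_scrapy(
    "scrapy.core.scraper",
    "Spider error processing <GET https://www.website.com/next_page>",
)
print(line)
```

The printed line has the same shape as the one in the question, apart from the timestamp, which is why searching the project files for logging.error finds nothing: the call lives inside the scrapy package, not in the project.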

As for finding the source of the error, the best hint is usually the traceback that comes along with this error log.

You can also follow the code in your spider that requests the URL 'https://www.website.com/next_page'.
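
To collect the failing URLs in a dedicated file, as the question asks, one possible approach is to attach a standard logging handler to the scrapy.core.scraper logger and pull the URL out of each "Spider error processing" message. This is a minimal sketch, not Scrapy's own mechanism: FailedUrlHandler, install and the failed_urls.txt path are hypothetical names, and the regular expression assumes the message keeps the "<GET url>" shape shown in the log above.

```python
import logging
import re
from pathlib import Path

# Assumes the message looks like:
#   Spider error processing <GET https://...> (referer: ...)
REQUEST_RE = re.compile(
    r"Spider error processing <(?P<method>[A-Z]+) (?P<url>[^>\s]+)>"
)


class FailedUrlHandler(logging.Handler):
    """Append the URL of each 'Spider error processing' record to a file."""

    def __init__(self, path):
        super().__init__(level=logging.ERROR)
        self.path = Path(path)

    def emit(self, record):
        match = REQUEST_RE.search(record.getMessage())
        if match is None:
            return  # a different ERROR from the same logger; ignore it
        with self.path.open("a", encoding="utf-8") as fh:
            fh.write(match.group("url") + "\n")


def install(path="failed_urls.txt"):
    """Attach the handler to the logger that emits the spider error message."""
    handler = FailedUrlHandler(path)
    logging.getLogger("scrapy.core.scraper").addHandler(handler)
    return handler
```

install() would need to run once before the crawl starts, for example from the spider's __init__. Parsing log text is fragile, though. Scrapy also sends a spider_error signal when a callback raises, and a handler connected to that signal receives the failure and the response directly, which may be a cleaner hook if you are willing to write a small extension.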

Tags: python, logging, scrapy, error-logging


