python - Where does "ERROR: Spider error processing" come from?
Problem description
I am reading the log from a previous run of my spider. I would like to know where this error comes from and how I can act on it:
2019-04-12 22:00:55 [scrapy.core.scraper] ERROR: Spider error processing <GET https://www.website.com/next_page> (referer: https://www.website.com/prev_page)
Traceback (most recent call last):...
I looked at middlewares.py, settings.py and the other files in my project, and I cannot find any line that calls logging.error or spider.logger.error. Even in the built-in methods def process_spider_exception(self, response, exception, spider): and def process_exception(self, request, exception, spider): I cannot find any line that emits a log message. The documentation did not make this clear to me either.
Now, about acting on it. I want to know where the message comes from because I would like to add some code that writes the URLs that raise a spider error into a dedicated file. I could then analyze them, fix the problem, and launch the spider again on just those URLs, which is more convenient than working from a Scrapy log file.
Beyond acting on it, I would also like to know where this message is generated and how it works.
Solution
To answer your question, that log message comes from the handle_spider_error method in the scrapy package. The [scrapy.core.scraper] prefix on the log line is the name of the logger that emitted it, which is why no file in your own project contains the logging call.
As for finding the source of the error, the best hint is usually the traceback that is printed along with this log entry.
You can also trace the code that requests the URL 'https://www.website.com/next_page'.
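To collect the failing URLs in a dedicated file, as the question asks, one option is Scrapy's spider_error signal, which is sent when a spider callback raises an exception. The following is only a minimal sketch, not something taken from the answer above: the class name FailedUrlRecorder, the FAILED_URLS_FILE setting, the tab-separated file layout and the load_failed_urls helper are all assumptions made for illustration.

```python
from pathlib import Path


class FailedUrlRecorder:
    """Append the URL of every response whose callback raised to a text file."""

    def __init__(self, path):
        self.path = Path(path)

    @classmethod
    def from_crawler(cls, crawler):
        # Imported lazily so this module can also be imported where Scrapy is absent.
        from scrapy import signals

        # FAILED_URLS_FILE is a hypothetical setting name chosen for this sketch.
        ext = cls(crawler.settings.get("FAILED_URLS_FILE", "failed_urls.txt"))
        # spider_error handlers receive (failure, response, spider).
        crawler.signals.connect(ext.spider_error, signal=signals.spider_error)
        return ext

    def spider_error(self, failure, response, spider):
        # failure is a Twisted Failure; failure.type is the exception class.
        line = f"{response.url}\t{failure.type.__name__}\n"
        with self.path.open("a", encoding="utf-8") as fh:
            fh.write(line)


def load_failed_urls(path):
    """Return the recorded URLs in first-seen order, without duplicates."""
    urls = []
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        url = line.split("\t", 1)[0]
        if url and url not in urls:
            urls.append(url)
    return urls
```

Assuming the class lives in a module such as myproject/extensions.py (a hypothetical path), it would be enabled by adding it to the EXTENSIONS setting in settings.py. On a later run, the spider could build its start requests from load_failed_urls("failed_urls.txt") so that only the pages that failed are crawled again.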