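python - scrapy splash: Connection was refused by other side: 61: Connection refused
Problem description
I have been trying to run Scrapy with Splash to extract JavaScript-rendered data. Splash is up and running, started with the following command: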
docker run -d -p 8050:8050 scrapinghub/splash --max-timeout 600
Splash responds at "http://127.0.0.1:8050" and "http://localhost:8050".
My hosts file looks like this:
##
# Host Database
#
# localhost is used to configure the loopback interface
# when the system is booting. Do not change this entry.
##
127.0.0.1 localhost
127.0.0.1 feedworker.host
255.255.255.255 broadcasthost
::1 localhost
# Added by Docker Desktop
# To allow the same kube context to work on the host and the container:
127.0.0.1 kubernetes.docker.internal
# End of section
But when I run the crawl with
scrapy crawl spider_name
I get this every time:
2021-09-03 14:03:26 [scrapy.downloadermiddlewares.retry] ERROR: Gave up retrying <GET https://ted.europa.eu/TED/browse/browseByPD.do via http://localhost:8050/execute> (failed 3 times): Connection was refused by other side: 61: Connection refused.
2021-09-03 14:03:26 [scrapy.core.scraper] ERROR: Error downloading <GET https://ted.europa.eu/TED/browse/browseByPD.do via http://localhost:8050/execute>
Traceback (most recent call last):
  File "/Users/sudipadh/Desktop/upwork/scrapy-rabbit/venv/lib/python3.8/site-packages/scrapy/core/downloader/middleware.py", line 44, in process_request
    return (yield download_func(request=request, spider=spider))
twisted.internet.error.ConnectionRefusedError: Connection was refused by other side: 61: Connection refused.
Scrapy settings for Splash:
DOWNLOADER_MIDDLEWARES = {
    'proactis.tor.middleware.TorMiddleware': 100,
    'scrapy_splash.SplashCookiesMiddleware': 723,
    'scrapy_splash.SplashMiddleware': 725,
    'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware': 810,
}
SPLASH_URL = 'http://localhost:8050/'
Any help would be greatly appreciated.
Solution
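As a first diagnostic step rather than a confirmed fix, it may help to check from the same machine and virtualenv whether a plain TCP connection to the Splash port succeeds. The sketch below is a minimal example using only the standard library; the helper name `can_connect` is hypothetical, and the host names and port are taken from the docker command and `SPLASH_URL` in the question. If the port is reachable here but Scrapy still reports a refusal, the request may not be going to Splash directly, for example if a middleware such as the custom `TorMiddleware` in the settings routes it through a proxy. That is an assumption; the log does not confirm it.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers ConnectionRefusedError, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    # Host names and port come from the question's docker command and SPLASH_URL.
    for host in ("localhost", "127.0.0.1"):
        status = "reachable" if can_connect(host, 8050) else "refused or unreachable"
        print(f"{host}:8050 is {status}")
```

If both hosts report as reachable, the Splash container itself is probably fine and the Scrapy-side configuration is the next place to look.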