Creating a restartable Scrapy spider reactor in PyQt5 using qt5reactor

Problem description

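My GUI has an "Update database" button. Every time the user presses it, I want to start a Scrapy spider that stores the scraped data in a Sqlite3 database. I implemented qt5reactor as this answer suggested, but now I get a ReactorNotRestartable error when I press the update button a second time. How can I fix this? I tried switching from CrawlerRunner to CrawlerProcess, but it still throws the same error (though maybe I did it wrong). I also can't use this answer, because q.get() blocks the event loop, so the GUI freezes while the spider is running. I'm new to multiprocessing, so I apologize if I'm missing something very obvious.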

In main.py

... # PyQt5 imports
import qt5reactor
from scrapy import crawler
from twisted.internet import reactor
from currency_scraper.currency_scraper.spiders.investor import InvestorSpider

class MyGUI(QMainWindow):

    def __init__(self):
        self.update_db_button.clicked.connect(self.on_clicked_update)
        ...

    def on_clicked_update(self):
        """Gives command to run scraper and fetch data from the website"""
        runner = crawler.CrawlerRunner(
            {
                "USER_AGENT": "currency scraper",
                "SCRAPY_SETTINGS_MODULE": "currency_scraper.currency_scraper.settings",
                "ITEM_PIPELINES": {
                    "currency_scraper.currency_scraper.pipelines.Sqlite3Pipeline": 300,
                }
            }
        )
        deferred = runner.crawl(InvestorSpider)
        deferred.addBoth(lambda _: reactor.stop())
        reactor.run() # has to be run here or the crawling doesn't start
        update_notification()

    ... # other stuff

if __name__ == "__main__":
   open_window()
   qt5reactor.install()
   reactor.run()

Error log:

Traceback (most recent call last):
  File "c:/Users/Familia/Documents/ProgramaþÒo/Python/Projetos/Currency_converter/main.py", line 330, in on_clicked_update
    reactor.run()
  File "c:\Users\Familia\Documents\ProgramaþÒo\Python\Projetos\Currency_converter\venv\lib\site-packages\twisted\internet\base.py", line 1282, in run
    self.startRunning(installSignalHandlers=installSignalHandlers)
  File "c:\Users\Familia\Documents\ProgramaþÒo\Python\Projetos\Currency_converter\venv\lib\site-packages\twisted\internet\base.py", line 1262, in startRunning
    ReactorBase.startRunning(self)
  File "c:\Users\Familia\Documents\ProgramaþÒo\Python\Projetos\Currency_converter\venv\lib\site-packages\twisted\internet\base.py", line 765, in startRunning    
    raise error.ReactorNotRestartable()
twisted.internet.error.ReactorNotRestartable

Tags: python, scrapy, multiprocessing, pyqt5, twisted

Solution
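
A Twisted reactor can be started only once per process, which is why the second `reactor.run()` inside `on_clicked_update` raises `ReactorNotRestartable`. One possible way around it, sketched below under assumptions, is to start the reactor a single time at program start and let the button handler only schedule a crawl on it, with no `reactor.run()` or `reactor.stop()` in the handler. The qt5reactor documentation asks for `qt5reactor.install()` to be called before `twisted.internet.reactor` is imported, so the imports are ordered accordingly. The names `start_crawl` and `main`, the minimal one-button window, and the `print` notification are hypothetical stand-ins for the question's `MyGUI`, `open_window()` and `update_notification()`; the settings and spider path are copied from the question. Stopping the reactor when the window closes is left out of this sketch.

```python
def start_crawl(runner, spider_cls, on_done):
    """Schedule one crawl on the reactor that is already running.

    Hypothetical helper: it never calls reactor.run() or reactor.stop(),
    so it can be invoked once per button press.
    """
    deferred = runner.crawl(spider_cls)
    # addBoth fires on success and on failure, so the GUI is always notified.
    deferred.addBoth(lambda _result: on_done())
    return deferred


def main():
    # Imports are local to make the required ordering explicit.
    import sys
    from PyQt5.QtWidgets import QApplication, QMainWindow, QPushButton

    app = QApplication(sys.argv)

    # Install the Qt-based reactor BEFORE twisted.internet.reactor is imported.
    import qt5reactor
    qt5reactor.install()
    from twisted.internet import reactor
    from scrapy.crawler import CrawlerRunner
    from currency_scraper.currency_scraper.spiders.investor import InvestorSpider

    # Settings copied from the question; one runner is reused for every crawl.
    runner = CrawlerRunner(
        {
            "USER_AGENT": "currency scraper",
            "SCRAPY_SETTINGS_MODULE": "currency_scraper.currency_scraper.settings",
            "ITEM_PIPELINES": {
                "currency_scraper.currency_scraper.pipelines.Sqlite3Pipeline": 300,
            },
        }
    )

    # Minimal stand-in for the question's MyGUI window (assumption).
    window = QMainWindow()
    button = QPushButton("Update database")
    window.setCentralWidget(button)
    button.clicked.connect(
        lambda _checked=False: start_crawl(
            runner, InvestorSpider, lambda: print("database updated")
        )
    )
    window.show()

    # Started exactly once; it drives both the Qt event loop and Twisted.
    reactor.run()
```

In the question's code this would amount to calling something like `main()` from the `if __name__ == "__main__":` block, and reducing `on_clicked_update` to the body of `start_crawl` with `update_notification` passed as the callback. Because the crawl runs inside the same event loop as the GUI, the window should stay responsive without a separate process or a blocking `q.get()`.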

