APScheduler + Scrapy: signal only works in main thread

Posted by sr4lhrrt on 2022-11-09 in Other

I want to combine APScheduler with Scrapy, but my code fails. How should I change it?

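from datetime import datetime

from apscheduler.schedulers.blocking import BlockingScheduler
from scrapy.crawler import CrawlerRunner
from scrapy.utils.log import configure_logging
from scrapy.utils.project import get_project_settings
from twisted.internet import defer, reactor
# Jobaispider and Jobpythonspider are my own spiders, imported from my project.

settings = get_project_settings()
configure_logging(settings)
runner = CrawlerRunner(settings)

@defer.inlineCallbacks
def crawl():
    # APScheduler calls this job in a worker thread, so reactor.run() is not in the main thread.
    reactor.run()
    yield runner.crawl(Jobaispider)  # this is my spider
    yield runner.crawl(Jobpythonspider)  # this is my spider
    reactor.stop()

sched = BlockingScheduler()
sched.add_job(crawl, 'date', run_date=datetime(2018, 12, 4, 10, 45, 10))
sched.start()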

Error: builtins.ValueError: signal only works in main thread
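
For context, the error message comes from Python's own `signal` module: handlers can only be installed from the main thread. By default APScheduler executes jobs in a thread pool, and Twisted's `reactor.run()` installs signal handlers unless told otherwise, so calling it inside a job hits this restriction. The following is a minimal standard-library sketch of the same failure, not the asker's code; the helper name `install_handler` is hypothetical.

```python
import signal
import threading


def install_handler(errors):
    """Try to register a SIGINT handler from whichever thread calls this."""
    try:
        signal.signal(signal.SIGINT, signal.default_int_handler)
    except ValueError as exc:
        # Python refuses to install signal handlers outside the main thread.
        errors.append(str(exc))


errors = []
# The worker thread plays the role of APScheduler's job thread.
worker = threading.Thread(target=install_handler, args=(errors,))
worker.start()
worker.join()

print(errors)  # one message saying that signal only works in the main thread
```

The exact wording of the message varies slightly between Python versions, but it always mentions the main thread.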

scrapy

Source: https://stackoverflow.com/questions/53605039/apschedulerscrapy-signal-only-works-in-main-thread

1 Answer

#1 c3frrgcw

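This question has already been answered in great detail here: How to integrate Flask & Scrapy?, which covers a variety of use cases and ideas. I also found one of the links in that thread very useful: https://github.com/notoriousno/scrapy-flask
To answer your question more directly, try the following. It uses the solution from the two links above; in particular, it uses the crochet library.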

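from datetime import datetime

import crochet
crochet.setup()  # let crochet manage the Twisted reactor in a background thread

from apscheduler.schedulers.blocking import BlockingScheduler
from scrapy.crawler import CrawlerRunner
from scrapy.utils.log import configure_logging
from scrapy.utils.project import get_project_settings
# Jobaispider and Jobpythonspider are your own spiders; import them from your project.

settings = get_project_settings()
configure_logging(settings)
runner = CrawlerRunner(settings)

# Note: Removing defer here for the example
# @defer.inlineCallbacks
@crochet.run_in_reactor
def crawl():
    # Both calls are scheduled in the reactor thread and return without blocking.
    runner.crawl(Jobaispider)  # this is my spider
    runner.crawl(Jobpythonspider)  # this is my spider

sched = BlockingScheduler()
sched.add_job(crawl, 'date', run_date=datetime(2018, 12, 4, 10, 45, 10))
sched.start()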
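
The idea behind `crochet.run_in_reactor` is that one long-lived event loop lives in its own thread and other threads hand work over to it instead of starting or stopping the loop themselves. As a rough analogy only, here is the same pattern sketched with the standard library's asyncio rather than Twisted or crochet; `fake_crawl` and `job` are hypothetical stand-ins for `runner.crawl(...)` and the scheduled job.

```python
import asyncio
import threading

# One long-lived event loop in a background thread
# (the analogue of the reactor thread that crochet manages).
loop = asyncio.new_event_loop()
loop_thread = threading.Thread(target=loop.run_forever, name="loop-thread", daemon=True)
loop_thread.start()


async def fake_crawl(name):
    """Hypothetical stand-in for a crawl; reports which thread ran it."""
    await asyncio.sleep(0.01)
    return f"{name} finished on {threading.current_thread().name}"


def job(results):
    """Runs in a scheduler-style worker thread and delegates to the loop thread."""
    future = asyncio.run_coroutine_threadsafe(fake_crawl("spider"), loop)
    results.append(future.result(timeout=5))


results = []
worker = threading.Thread(target=job, args=(results,), name="scheduler-worker")
worker.start()
worker.join()

# Shut the loop down cleanly once the work is done.
loop.call_soon_threadsafe(loop.stop)
loop_thread.join(timeout=5)
loop.close()

print(results)
```

The worker thread never runs the loop itself; it only submits work and waits for the result, which is why no signal handlers need to be installed outside the main thread.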

Answered on 2022-11-09
