Previously, when I ran this command in the VS Code terminal, I didn't get any errors.
scrapy crawl ma -a start_at=1 -a end_and=2 -a quick_crawl=false
But now, and I don't know why, it fails with this error:
2022-07-20 10:10:14 [log.log_scrapy_info] INFO : Scrapy 2.2.1 started (bot: regulation_crawler)
2022-07-20 10:10:14 [log.log_scrapy_info] INFO : Versions: lxml 4.9.1.0, libxml2 2.9.14, cssselect 1.1.0, parsel 1.6.0, w3lib 1.22.0, Twisted 22.4.0, Python 3.8.10 (default, Jun 22 2022, 20:18:18) - [GCC 9.4.0], pyOpenSSL 22.0.0 (OpenSSL 3.0.5 5 Jul 2022), cryptography 37.0.4, Platform Linux-5.15.0-41-generic-x86_64-with-glibc2.29
2022-07-20 10:10:14 [log.log_scrapy_info] DEBUG : Using reactor: twisted.internet.epollreactor.EPollReactor
2022-07-20 10:10:14 [crawler.__init__] INFO : Overridden settings:
{'AUTOTHROTTLE_DEBUG': True,
'AUTOTHROTTLE_ENABLED': True,
'BOT_NAME': 'regulation_crawler',
'DOWNLOAD_DELAY': 0.5,
'FEED_EXPORT_INDENT': 4,
'LOG_FILE': '',
'LOG_FORMAT': '%(asctime)s [%(module)s.%(funcName)s] %(levelname)s : '
'%(message)s',
'LOG_LEVEL': 10,
'NEWSPIDER_MODULE': 'regulation_crawler.crawler.spiders',
'ROBOTSTXT_OBEY': True,
'SPIDER_MODULES': ['regulation_crawler.crawler.spiders'],
'USER_AGENT': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:84.0) '
'Gecko/20100101 Firefox/84.0'}
2022-07-20 10:10:14 [telnet.__init__] INFO : Telnet Password: 42678d057d4fa701
2022-07-20 10:10:14 [warnings._showwarnmsg] WARNING : /home/fhadli/.cache/pypoetry/virtualenvs/regulation-crawler-_1wB9V_j-py3.8/lib/python3.8/site-packages/scrapy/extensions/feedexport.py:210: ScrapyDeprecationWarning: The `FEED_URI` and `FEED_FORMAT` settings have been deprecated in favor of the `FEEDS` setting. Please see the `FEEDS` setting docs for more details
exporter = cls(crawler)
2022-07-20 10:10:14 [middleware.from_settings] INFO : Enabled extensions:
['scrapy.extensions.corestats.CoreStats',
'scrapy.extensions.telnet.TelnetConsole',
'scrapy.extensions.memusage.MemoryUsage',
'scrapy.extensions.feedexport.FeedExporter',
'scrapy.extensions.logstats.LogStats',
'scrapy.extensions.throttle.AutoThrottle',
'regulation_crawler.crawler.extensions.statsd.CrawlerStatsdExporterExtension']
2022-07-20 10:10:14 [__init__._load_handler] ERROR : Loading "scrapy.core.downloader.handlers.http.HTTPDownloadHandler" for scheme "http"
Traceback (most recent call last):
File "/home/fhadli/.cache/pypoetry/virtualenvs/regulation-crawler-_1wB9V_j-py3.8/lib/python3.8/site-packages/scrapy/core/downloader/handlers/__init__.py", line 49, in _load_handler
dhcls = load_object(path)
File "/home/fhadli/.cache/pypoetry/virtualenvs/regulation-crawler-_1wB9V_j-py3.8/lib/python3.8/site-packages/scrapy/utils/misc.py", line 50, in load_object
mod = import_module(module)
File "/usr/lib/python3.8/importlib/__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
File "<frozen importlib._bootstrap>", line 991, in _find_and_load
File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 848, in exec_module
File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
File "/home/fhadli/.cache/pypoetry/virtualenvs/regulation-crawler-_1wB9V_j-py3.8/lib/python3.8/site-packages/scrapy/core/downloader/handlers/http.py", line 2, in <module>
from scrapy.core.downloader.handlers.http11 import (
File "/home/fhadli/.cache/pypoetry/virtualenvs/regulation-crawler-_1wB9V_j-py3.8/lib/python3.8/site-packages/scrapy/core/downloader/handlers/http11.py", line 24, in <module>
from scrapy.core.downloader.webclient import _parse
File "/home/fhadli/.cache/pypoetry/virtualenvs/regulation-crawler-_1wB9V_j-py3.8/lib/python3.8/site-packages/scrapy/core/downloader/webclient.py", line 4, in <module>
from twisted.web.client import HTTPClientFactory
ImportError: cannot import name 'HTTPClientFactory' from 'twisted.web.client' (unknown location)
2022-07-20 10:10:14 [__init__._load_handler] ERROR : Loading "scrapy.core.downloader.handlers.http.HTTPDownloadHandler" for scheme "https"
Traceback (most recent call last):
File "/home/fhadli/.cache/pypoetry/virtualenvs/regulation-crawler-_1wB9V_j-py3.8/lib/python3.8/site-packages/scrapy/core/downloader/handlers/__init__.py", line 49, in _load_handler
dhcls = load_object(path)
File "/home/fhadli/.cache/pypoetry/virtualenvs/regulation-crawler-_1wB9V_j-py3.8/lib/python3.8/site-packages/scrapy/utils/misc.py", line 50, in load_object
mod = import_module(module)
File "/usr/lib/python3.8/importlib/__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
File "<frozen importlib._bootstrap>", line 991, in _find_and_load
File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 848, in exec_module
File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
File "/home/fhadli/.cache/pypoetry/virtualenvs/regulation-crawler-_1wB9V_j-py3.8/lib/python3.8/site-packages/scrapy/core/downloader/handlers/http.py", line 2, in <module>
from scrapy.core.downloader.handlers.http11 import (
File "/home/fhadli/.cache/pypoetry/virtualenvs/regulation-crawler-_1wB9V_j-py3.8/lib/python3.8/site-packages/scrapy/core/downloader/handlers/http11.py", line 24, in <module>
from scrapy.core.downloader.webclient import _parse
File "/home/fhadli/.cache/pypoetry/virtualenvs/regulation-crawler-_1wB9V_j-py3.8/lib/python3.8/site-packages/scrapy/core/downloader/webclient.py", line 4, in <module>
from twisted.web.client import HTTPClientFactory
ImportError: cannot import name 'HTTPClientFactory' from 'twisted.web.client' (unknown location)
2022-07-20 10:10:14 [__init__._load_handler] ERROR : Loading "scrapy.core.downloader.handlers.s3.S3DownloadHandler" for scheme "s3"
Traceback (most recent call last):
File "/home/fhadli/.cache/pypoetry/virtualenvs/regulation-crawler-_1wB9V_j-py3.8/lib/python3.8/site-packages/scrapy/core/downloader/handlers/__init__.py", line 49, in _load_handler
dhcls = load_object(path)
File "/home/fhadli/.cache/pypoetry/virtualenvs/regulation-crawler-_1wB9V_j-py3.8/lib/python3.8/site-packages/scrapy/utils/misc.py", line 50, in load_object
mod = import_module(module)
File "/usr/lib/python3.8/importlib/__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
File "<frozen importlib._bootstrap>", line 991, in _find_and_load
File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 848, in exec_module
File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
File "/home/fhadli/.cache/pypoetry/virtualenvs/regulation-crawler-_1wB9V_j-py3.8/lib/python3.8/site-packages/scrapy/core/downloader/handlers/s3.py", line 3, in <module>
from scrapy.core.downloader.handlers.http import HTTPDownloadHandler
File "/home/fhadli/.cache/pypoetry/virtualenvs/regulation-crawler-_1wB9V_j-py3.8/lib/python3.8/site-packages/scrapy/core/downloader/handlers/http.py", line 2, in <module>
from scrapy.core.downloader.handlers.http11 import (
File "/home/fhadli/.cache/pypoetry/virtualenvs/regulation-crawler-_1wB9V_j-py3.8/lib/python3.8/site-packages/scrapy/core/downloader/handlers/http11.py", line 24, in <module>
from scrapy.core.downloader.webclient import _parse
File "/home/fhadli/.cache/pypoetry/virtualenvs/regulation-crawler-_1wB9V_j-py3.8/lib/python3.8/site-packages/scrapy/core/downloader/webclient.py", line 4, in <module>
from twisted.web.client import HTTPClientFactory
ImportError: cannot import name 'HTTPClientFactory' from 'twisted.web.client' (unknown location)
Unhandled error in Deferred:
2022-07-20 10:10:14 [_legacy.publishToNewObserver] CRITICAL : Unhandled error in Deferred:
Traceback (most recent call last):
File "/home/fhadli/.cache/pypoetry/virtualenvs/regulation-crawler-_1wB9V_j-py3.8/lib/python3.8/site-packages/scrapy/crawler.py", line 192, in crawl
return self._crawl(crawler, *args,**kwargs)
File "/home/fhadli/.cache/pypoetry/virtualenvs/regulation-crawler-_1wB9V_j-py3.8/lib/python3.8/site-packages/scrapy/crawler.py", line 196, in _crawl
d = crawler.crawl(*args,**kwargs)
File "/home/fhadli/.cache/pypoetry/virtualenvs/regulation-crawler-_1wB9V_j-py3.8/lib/python3.8/site-packages/twisted/internet/defer.py", line 1905, in unwindGenerator
return _cancellableInlineCallbacks(gen)
File "/home/fhadli/.cache/pypoetry/virtualenvs/regulation-crawler-_1wB9V_j-py3.8/lib/python3.8/site-packages/twisted/internet/defer.py", line 1815, in _cancellableInlineCallbacks
_inlineCallbacks(None, gen, status)
--- <exception caught here> ---
File "/home/fhadli/.cache/pypoetry/virtualenvs/regulation-crawler-_1wB9V_j-py3.8/lib/python3.8/site-packages/twisted/internet/defer.py", line 1660, in _inlineCallbacks
result = current_context.run(gen.send, result)
File "/home/fhadli/.cache/pypoetry/virtualenvs/regulation-crawler-_1wB9V_j-py3.8/lib/python3.8/site-packages/scrapy/crawler.py", line 87, in crawl
self.engine = self._create_engine()
File "/home/fhadli/.cache/pypoetry/virtualenvs/regulation-crawler-_1wB9V_j-py3.8/lib/python3.8/site-packages/scrapy/crawler.py", line 101, in _create_engine
return ExecutionEngine(self, lambda _: self.stop())
File "/home/fhadli/.cache/pypoetry/virtualenvs/regulation-crawler-_1wB9V_j-py3.8/lib/python3.8/site-packages/scrapy/core/engine.py", line 69, in __init__
self.downloader = downloader_cls(crawler)
File "/home/fhadli/.cache/pypoetry/virtualenvs/regulation-crawler-_1wB9V_j-py3.8/lib/python3.8/site-packages/scrapy/core/downloader/__init__.py", line 83, in __init__
self.middleware = DownloaderMiddlewareManager.from_crawler(crawler)
File "/home/fhadli/.cache/pypoetry/virtualenvs/regulation-crawler-_1wB9V_j-py3.8/lib/python3.8/site-packages/scrapy/middleware.py", line 53, in from_crawler
return cls.from_settings(crawler.settings, crawler)
File "/home/fhadli/.cache/pypoetry/virtualenvs/regulation-crawler-_1wB9V_j-py3.8/lib/python3.8/site-packages/scrapy/middleware.py", line 34, in from_settings
mwcls = load_object(clspath)
File "/home/fhadli/.cache/pypoetry/virtualenvs/regulation-crawler-_1wB9V_j-py3.8/lib/python3.8/site-packages/scrapy/utils/misc.py", line 50, in load_object
mod = import_module(module)
File "/usr/lib/python3.8/importlib/__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
File "<frozen importlib._bootstrap>", line 991, in _find_and_load
File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 848, in exec_module
File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
File "/home/fhadli/.cache/pypoetry/virtualenvs/regulation-crawler-_1wB9V_j-py3.8/lib/python3.8/site-packages/scrapy/downloadermiddlewares/retry.py", line 28, in <module>
from scrapy.core.downloader.handlers.http11 import TunnelError
File "/home/fhadli/.cache/pypoetry/virtualenvs/regulation-crawler-_1wB9V_j-py3.8/lib/python3.8/site-packages/scrapy/core/downloader/handlers/http11.py", line 24, in <module>
from scrapy.core.downloader.webclient import _parse
File "/home/fhadli/.cache/pypoetry/virtualenvs/regulation-crawler-_1wB9V_j-py3.8/lib/python3.8/site-packages/scrapy/core/downloader/webclient.py", line 4, in <module>
from twisted.web.client import HTTPClientFactory
builtins.ImportError: cannot import name 'HTTPClientFactory' from 'twisted.web.client' (unknown location)
2022-07-20 10:10:14 [_legacy.publishToNewObserver] CRITICAL :
Traceback (most recent call last):
File "/home/fhadli/.cache/pypoetry/virtualenvs/regulation-crawler-_1wB9V_j-py3.8/lib/python3.8/site-packages/twisted/internet/defer.py", line 1660, in _inlineCallbacks
result = current_context.run(gen.send, result)
File "/home/fhadli/.cache/pypoetry/virtualenvs/regulation-crawler-_1wB9V_j-py3.8/lib/python3.8/site-packages/scrapy/crawler.py", line 87, in crawl
self.engine = self._create_engine()
File "/home/fhadli/.cache/pypoetry/virtualenvs/regulation-crawler-_1wB9V_j-py3.8/lib/python3.8/site-packages/scrapy/crawler.py", line 101, in _create_engine
return ExecutionEngine(self, lambda _: self.stop())
File "/home/fhadli/.cache/pypoetry/virtualenvs/regulation-crawler-_1wB9V_j-py3.8/lib/python3.8/site-packages/scrapy/core/engine.py", line 69, in __init__
self.downloader = downloader_cls(crawler)
File "/home/fhadli/.cache/pypoetry/virtualenvs/regulation-crawler-_1wB9V_j-py3.8/lib/python3.8/site-packages/scrapy/core/downloader/__init__.py", line 83, in __init__
self.middleware = DownloaderMiddlewareManager.from_crawler(crawler)
File "/home/fhadli/.cache/pypoetry/virtualenvs/regulation-crawler-_1wB9V_j-py3.8/lib/python3.8/site-packages/scrapy/middleware.py", line 53, in from_crawler
return cls.from_settings(crawler.settings, crawler)
File "/home/fhadli/.cache/pypoetry/virtualenvs/regulation-crawler-_1wB9V_j-py3.8/lib/python3.8/site-packages/scrapy/middleware.py", line 34, in from_settings
mwcls = load_object(clspath)
File "/home/fhadli/.cache/pypoetry/virtualenvs/regulation-crawler-_1wB9V_j-py3.8/lib/python3.8/site-packages/scrapy/utils/misc.py", line 50, in load_object
mod = import_module(module)
File "/usr/lib/python3.8/importlib/__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
File "<frozen importlib._bootstrap>", line 991, in _find_and_load
File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 848, in exec_module
File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
File "/home/fhadli/.cache/pypoetry/virtualenvs/regulation-crawler-_1wB9V_j-py3.8/lib/python3.8/site-packages/scrapy/downloadermiddlewares/retry.py", line 28, in <module>
from scrapy.core.downloader.handlers.http11 import TunnelError
File "/home/fhadli/.cache/pypoetry/virtualenvs/regulation-crawler-_1wB9V_j-py3.8/lib/python3.8/site-packages/scrapy/core/downloader/handlers/http11.py", line 24, in <module>
from scrapy.core.downloader.webclient import _parse
File "/home/fhadli/.cache/pypoetry/virtualenvs/regulation-crawler-_1wB9V_j-py3.8/lib/python3.8/site-packages/scrapy/core/downloader/webclient.py", line 4, in <module>
from twisted.web.client import HTTPClientFactory
ImportError: cannot import name 'HTTPClientFactory' from 'twisted.web.client' (unknown location)
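Things I have tried that did not work:
- Deleting and recreating the virtualenv
- I have looked at other similar questions, but they all concern older versions of Scrapy, whereas I am now using a newer version.
Scrapy version: 2.2.1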
1 Answer
Changing the Twisted version from 22.4.0 to 21.7.0 solved the problem.
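To check whether an environment is affected before or after changing the pin, something like the following may help. It is only a minimal diagnostic sketch, not part of Scrapy or Twisted: the helper names `installed_version`, `has_attribute` and `diagnose` are made up for illustration. It reports the installed Scrapy and Twisted versions and whether `twisted.web.client` exposes `HTTPClientFactory`, the name the traceback shows `scrapy/core/downloader/webclient.py` trying to import.

```python
from importlib import import_module, metadata


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def has_attribute(module_name, attr):
    """Return True if module_name imports cleanly and exposes attr."""
    try:
        module = import_module(module_name)
    except ImportError:
        # Covers ModuleNotFoundError too (package not installed at all).
        return False
    return hasattr(module, attr)


def diagnose():
    """Collect the facts relevant to the ImportError in the traceback."""
    return {
        "scrapy": installed_version("scrapy"),
        "twisted": installed_version("twisted"),
        "HTTPClientFactory": has_attribute(
            "twisted.web.client", "HTTPClientFactory"
        ),
    }


if __name__ == "__main__":
    for key, value in diagnose().items():
        print(f"{key}: {value}")
```

Run it with the same interpreter that runs `scrapy` (here, the Poetry virtualenv shown in the log paths). If the last line prints `False`, the installed Twisted does not provide the class that Scrapy 2.2.1 imports, and pinning Twisted to 21.7.0 as described above, for example with `pip install "Twisted==21.7.0"`, should make it print `True` again.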