scrapy: Problem deploying a scraper to Zyte (formerly Scrapinghub)

ygya80vv posted on 2023-04-30 in Other

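My spider has to read some data from an input.csv file. It runs fine locally, but when I deploy it to Zyte with shub deploy, input.csv is not included in the build.
As a result, when I try to run it on the server, it fails with the following error.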

Traceback (most recent call last):
  File "<frozen zipimport>", line 177, in get_data
KeyError: 'webscrap/resources/input.csv'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/scrapy/core/engine.py", line 127, in _next_request
    request = next(slot.start_requests)
  File "/app/__main__.egg/webscrap/spiders/website_scraper.py", line 13, in start_requests
    zipcodes_csv = pkgutil.get_data("webscrap", "resources/input.csv")
  File "/usr/local/lib/python3.8/pkgutil.py", line 637, in get_data
    return loader.get_data(resource_name)
  File "<frozen zipimport>", line 179, in get_data
OSError: [Errno 0] : 'webscrap/resources/input.csv'

Here is my code:

zipcodes_csv = pkgutil.get_data("webscrap", "resources/input.csv")
with io.TextIOWrapper(io.BytesIO(zipcodes_csv), encoding='utf-8') as file:
    csvreader = csv.DictReader(file)

Here is the setup.py file:

setup(
    name         = 'project',
    version      = '1.0',
    packages     = find_packages(),
    entry_points = {'scrapy': ['settings = webscrap.settings']},
    package_data={
        'project': ['resources/*.csv']
    },
    include_package_data=True,
)
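
One way to confirm that the CSV really is missing from the build is to open the egg and list its contents, since an egg is an ordinary zip archive. The sketch below is only an illustration: the helper names `list_resources` and `egg_contains` are made up for this example, and it assumes an egg has been built locally first (for instance with `python setup.py bdist_egg`, which writes it under dist/).

```python
import zipfile


def list_resources(egg_path, suffix=".csv"):
    """Return the names of all archive members that end with `suffix`."""
    # An egg is a zip archive, so zipfile can read it directly.
    with zipfile.ZipFile(egg_path) as egg:
        return [name for name in egg.namelist() if name.endswith(suffix)]


def egg_contains(egg_path, member):
    """Return True if `member` (a path inside the archive) is present."""
    with zipfile.ZipFile(egg_path) as egg:
        return member in egg.namelist()
```

Calling `egg_contains(path_to_egg, "webscrap/resources/input.csv")` on the built egg should return False if the file was left out, which would match the KeyError in the traceback above. The exact egg file name depends on the project name, version and Python version.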

vybvopom1#

I fixed this by changing the setup.py file to:

setup(
    name         = 'webscrap',
    version      = '2.0',
    packages     = find_packages(),
    entry_points = {'scrapy': ['settings = webscrap.settings']},
    package_data={
        # The key must be the actual package name (webscrap), otherwise the CSV is not shipped
        'webscrap': ['resources/*.csv']
    },
    include_package_data=True,
)


I also resolved some dependency issues in requirements.txt and added it to the scrapinghub.yml file.
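
For the reading side, a small wrapper around `pkgutil.get_data` can turn a missing resource into a clearer error than the bare `OSError` from the traceback. This is a minimal sketch, not the original spider code: the function name `read_packaged_csv` is hypothetical, and it assumes the CSV has a header row so that `csv.DictReader` can be used as in the question.

```python
import csv
import io
import pkgutil


def read_packaged_csv(package, resource, encoding="utf-8"):
    """Load a CSV bundled inside `package` and return its rows as dicts."""
    try:
        raw = pkgutil.get_data(package, resource)
    except OSError as exc:
        # Raised when the loader cannot find the file, e.g. it was left out of the egg.
        raise FileNotFoundError(
            f"{resource!r} is not packaged with {package!r}; check package_data in setup.py"
        ) from exc
    if raw is None:
        # get_data returns None if the package cannot be located
        # or its loader does not support get_data.
        raise FileNotFoundError(f"could not load {resource!r} from package {package!r}")
    with io.TextIOWrapper(io.BytesIO(raw), encoding=encoding) as handle:
        return list(csv.DictReader(handle))
```

Inside `start_requests`, the rows could then be iterated with `for row in read_packaged_csv("webscrap", "resources/input.csv"): ...`, using whatever column names the real input.csv defines.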
