unstructured-ingest s3 command leaves Fsspec.Downloader.download_config.download_dir as None

nxagd54h  posted 2 months ago  in  Other

Running the command:

unstructured-ingest \
   s3 \
   --remote-url s3://anticythera/ \
   --anonymous \
   --output-dir /Users/anticythera/PycharmProjects/scientificProject/data/ \
   --num-processes 2

results in the error:

ERROR: /Users/anticythera/.cache/unstructured/ingest/pipeline/index/8485948ff856.json: [download]
unsupported operand type(s) for /: 'NoneType' and 'PosixPath'

This happens because Fsspec.Downloader.download_config.download_dir is None.
I am running macOS 14.5.
Stack trace:

2024-05-26 06:55:10,617 MainProcess INFO     Calling DownloadStep with 1 docs
INFO: Calling DownloadStep with 1 docs
2024-05-26 06:55:10,617 MainProcess INFO     processing content async
INFO: processing content async
2024-05-26 06:55:10,619 MainProcess ERROR    Exception raised while running download
Traceback (most recent call last):
  File "/opt/anaconda3/lib/python3.11/site-packages/unstructured/ingest/v2/pipeline/interfaces.py", line 97, in run_async
    return await self._run_async(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/anaconda3/lib/python3.11/site-packages/unstructured/ingest/v2/pipeline/steps/download.py", line 84, in _run_async
    download_path = self.process.get_download_path(file_data=file_data)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/anaconda3/lib/python3.11/site-packages/unstructured/ingest/v2/processes/connectors/fsspec/fsspec.py", line 240, in get_download_path
    self.download_config.download_dir / Path(file_data.source_identifiers.rel_path)
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
TypeError: unsupported operand type(s) for /: 'NoneType' and 'PosixPath'
ERROR: Exception raised while running download
Traceback (most recent call last):
  File "/opt/anaconda3/lib/python3.11/site-packages/unstructured/ingest/v2/pipeline/interfaces.py", line 97, in run_async
    return await self._run_async(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/anaconda3/lib/python3.11/site-packages/unstructured/ingest/v2/pipeline/steps/download.py", line 84, in _run_async
    download_path = self.process.get_download_path(file_data=file_data)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/anaconda3/lib/python3.11/site-packages/unstructured/ingest/v2/processes/connectors/fsspec/fsspec.py", line 240, in get_download_path
    self.download_config.download_dir / Path(file_data.source_identifiers.rel_path)
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
TypeError: unsupported operand type(s) for /: 'NoneType' and 'PosixPath'
2024-05-26 06:55:10,622 MainProcess ERROR    1 failed documents:
ERROR: 1 failed documents:
2024-05-26 06:55:10,622 MainProcess ERROR    /Users/anticythera/.cache/unstructured/ingest/pipeline/index/8485948ff856.json: [download] unsupported operand type(s) for /: 'NoneType' and 'PosixPath'
ERROR: /Users/anticythera/.cache/unstructured/ingest/pipeline/index/8485948ff856.json: [download] unsupported operand type(s) for /: 'NoneType' and 'PosixPath'
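
The failing expression in the traceback above can be reproduced outside the pipeline. The sketch below is only an illustration: `reproduce_error`, `resolve_download_path`, and the fallback directory name are hypothetical and are not part of unstructured. It shows that dividing None by a Path raises the same TypeError, and one possible guard that substitutes a default directory when download_dir is unset. It is not the library's actual fix.

```python
import tempfile
from pathlib import Path
from typing import Optional


def reproduce_error() -> str:
    """Evaluate `None / Path(...)` as fsspec.py line 240 does and return the error text."""
    download_dir = None  # mirrors download_config.download_dir in the report
    try:
        download_dir / Path("example.pdf")
    except TypeError as exc:
        return str(exc)
    return ""


def resolve_download_path(download_dir: Optional[Path], rel_path: str) -> Path:
    """Hypothetical guard: fall back to a temp location when download_dir is None."""
    if download_dir is None:
        # Assumed default for illustration only; the real default may differ.
        download_dir = Path(tempfile.gettempdir()) / "unstructured-ingest-downloads"
    return download_dir / Path(rel_path)


if __name__ == "__main__":
    print(reproduce_error())
    print(resolve_download_path(None, "docs/example.pdf"))
```

With a guard of this kind the path join no longer depends on the caller having set download_dir, although whether a silent fallback or an explicit validation error is preferable is a design decision for the maintainers.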
mlmc2os5  #1

Thanks @tuvalusoftware, we will look into this issue as soon as possible.
