Numpy v2.0.0 breaks the ability to download models using spaCy

k2fxgqgv  posted 5 months ago in Other

How to reproduce the behaviour
In my Dockerfile, I run the following commands:

FROM --platform=linux/amd64 python:3.12.4

RUN pip install --upgrade pip

RUN pip install torch --index-url https://download.pytorch.org/whl/cpu
RUN pip install spacy

RUN python -m spacy download en_core_web_lg

It returns the following error (and stack trace):

2.519 Traceback (most recent call last):
2.519   File "<frozen runpy>", line 189, in _run_module_as_main
2.519   File "<frozen runpy>", line 148, in _get_module_details
2.519   File "<frozen runpy>", line 112, in _get_module_details
2.519   File "/usr/local/lib/python3.12/site-packages/spacy/__init__.py", line 6, in <module>
2.521     from .errors import setup_default_warnings
2.522   File "/usr/local/lib/python3.12/site-packages/spacy/errors.py", line 3, in <module>
2.522     from .compat import Literal
2.522   File "/usr/local/lib/python3.12/site-packages/spacy/compat.py", line 39, in <module>
2.522     from thinc.api import Optimizer  # noqa: F401
2.522     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2.522   File "/usr/local/lib/python3.12/site-packages/thinc/api.py", line 1, in <module>
2.522     from .backends import (
2.522   File "/usr/local/lib/python3.12/site-packages/thinc/backends/__init__.py", line 17, in <module>
2.522     from .cupy_ops import CupyOps
2.522   File "/usr/local/lib/python3.12/site-packages/thinc/backends/cupy_ops.py", line 16, in <module>
2.522     from .numpy_ops import NumpyOps
2.522   File "thinc/backends/numpy_ops.pyx", line 1, in init thinc.backends.numpy_ops
2.524 ValueError: numpy.dtype size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject

Pinning numpy to the previous version resolves the issue:

FROM --platform=linux/amd64 python:3.12.4

RUN pip install --upgrade pip

RUN pip install torch --index-url https://download.pytorch.org/whl/cpu
RUN pip install numpy==1.26.4 spacy

RUN python -m spacy download en_core_web_lg
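
For anyone who wants to confirm the pin took effect before the download step runs, here is a minimal sketch of a guard script. It is not part of the original fix: the helper names `major_version` and `numpy_is_pre_2` are hypothetical, and the parsing assumes a plain dotted version string such as `1.26.4`. It only uses the standard library's `importlib.metadata`.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def major_version(version_string: str) -> int:
    """Return the leading integer of a dotted version string such as '1.26.4'."""
    return int(version_string.split(".")[0])


def numpy_is_pre_2(installed: Optional[str] = None) -> bool:
    """Return True if the given (or installed) NumPy version is below 2.0.

    If no version string is passed, the installed distribution is looked up;
    a missing NumPy installation is reported as False.
    """
    if installed is None:
        try:
            installed = version("numpy")
        except PackageNotFoundError:
            return False
    return major_version(installed) < 2


if __name__ == "__main__":
    # Hypothetical usage inside the image, before `python -m spacy download ...`
    print("NumPy < 2.0 installed:", numpy_is_pre_2())
```

Run inside the image, this prints whether a pre-2.0 NumPy is present; it does not import spaCy or thinc, so it will not trigger the ValueError itself.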

ma8fv8wu  #2

This solution helped, thanks.

9rygscc1  #3

+1 I also had this problem. Thanks for posting the solution 👍

qfe3c7zg  #4

Those solutions do indeed work, but I would still like to see a fix in the codebase itself. The issue is that inside the project's requirements.txt (just an assumption after a short look at the codebase), the version is specified as follows:

numpy>=1.15.0; python_version < "3.9"
numpy>=1.19.0; python_version >= "3.9"

I am a huge fan, in all of my projects, of always pinning dependencies even up to the patch version.
I would suggest a PR that looks like this:

numpy>=1.15.0,<2.0.0; python_version < "3.9"
numpy>=1.19.0,<2.0.0; python_version >= "3.9"

This at least pins the version to the current major release, which should always be the case anyway, as a major version can (and most likely always will) contain breaking changes.
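
To illustrate what the added upper bound changes, here is a rough sketch using plain tuple comparison. This is not how pip evaluates specifiers (real tools follow PEP 440 parsing); the helpers `parse` and `satisfies` are hypothetical and only handle simple dotted numeric versions.

```python
from typing import Optional, Tuple


def parse(version_string: str) -> Tuple[int, ...]:
    """Turn a dotted numeric version like '1.26.4' into a comparable tuple."""
    return tuple(int(part) for part in version_string.split("."))


def satisfies(candidate: str, lower: str, upper: Optional[str] = None) -> bool:
    """Check lower <= candidate, and candidate < upper when an upper bound is given."""
    parsed = parse(candidate)
    if parsed < parse(lower):
        return False
    if upper is not None and parsed >= parse(upper):
        return False
    return True


if __name__ == "__main__":
    # Current constraint (no upper bound) versus the suggested one (<2.0.0)
    for candidate in ("1.26.4", "2.0.0"):
        print(candidate, satisfies(candidate, "1.19.0"), satisfies(candidate, "1.19.0", "2.0.0"))
```

With only the lower bound, both 1.26.4 and 2.0.0 are accepted; with the suggested `<2.0.0` bound, 2.0.0 is rejected while 1.26.4 still passes.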

tvz2xvvm  #5

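@DoctorManhattan123 To clarify, the solution I posted is only meant as a temporary measure.
Ideally, all downstream users of numpy (including library maintainers) should complete the migration to numpy 2.0.0. Given the size of the release, I expect this will take time.
Pinning the version is meant to help those looking for a quick fix for their CI/CD or other affected processes until a more robust solution is implemented in the affected codebases.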

6kkfgxo0  #6

The issue has already been noted in thinc.

w1jd8yoj  #7

This was helpful. Thanks!
