text-generation-inference local install fails

wa7juj8i  posted 23 days ago in Other

System Info

I followed https://github.com/huggingface/text-generation-inference?tab=readme-ov-file#local-install to install TGI locally, but I keep getting errors related to vllm.

RuntimeError: Cannot find CMake executable
make[1]: *** [Makefile-vllm:5: build-vllm-cuda] Error 1
make[1]: Leaving directory '/home/gpu/path/llm/text-generation-inference/server'
make: *** [Makefile:2: install-server] Error 2

Information

  • Docker
  • The CLI directly

Tasks

  • An officially supported command
  • My own modifications

Reproduction

git clone https://github.com/huggingface/text-generation-inference.git
cd text-generation-inference/
BUILD_EXTENSIONS=True make install

Expected behavior

The build completes.

6rqinv9w  #1

Hi @ragesh2000!
Thanks for reporting this! Could you provide more information about your system?
It looks like you might not have cmake installed?
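
One quick way to check whether a cmake executable is visible to the build is to look it up on PATH from the same environment (for example the same conda env) that runs make install. A minimal Python sketch; the helper name `find_tool` is hypothetical and not part of TGI:

```python
import shutil


def find_tool(name):
    """Return the path of an executable found on PATH, or None if it is missing."""
    return shutil.which(name)


if __name__ == "__main__":
    # Tools the vllm extension build plausibly needs (an assumption, not an official list).
    for tool in ("cmake", "nvcc", "g++"):
        path = find_tool(tool)
        print(f"{tool}: {path if path else 'not found on PATH'}")
```

If cmake shows up in a login shell but not here, the build is probably running with a different PATH than the one used to install it.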

3z6pesqy  #2

@ErikKaum I have cmake installed on my system (cmake version 3.22.1).
Here is the complete traceback:

-- Enabling C extension.
-- Enabling moe extension.
-- Configuring done
-- Generating done
-- Build files have been written to: /home/gpu/ai/llm/quantise/text-generation-inference/server/vllm/build/temp.linux-x86_64-cpython-311
gmake[2]: Entering directory '/home/gpu/ai/llm/quantise/text-generation-inference/server/vllm/build/temp.linux-x86_64-cpython-311'
gmake[3]: Entering directory '/home/gpu/ai/llm/quantise/text-generation-inference/server/vllm/build/temp.linux-x86_64-cpython-311'
gmake[4]: Entering directory '/home/gpu/ai/llm/quantise/text-generation-inference/server/vllm/build/temp.linux-x86_64-cpython-311'
gmake[5]: Entering directory '/home/gpu/ai/llm/quantise/text-generation-inference/server/vllm/build/temp.linux-x86_64-cpython-311'
gmake[5]: Leaving directory '/home/gpu/ai/llm/quantise/text-generation-inference/server/vllm/build/temp.linux-x86_64-cpython-311'
gmake[5]: Entering directory '/home/gpu/ai/llm/quantise/text-generation-inference/server/vllm/build/temp.linux-x86_64-cpython-311'
[ 33%] Building CXX object CMakeFiles/_moe_C.dir/csrc/moe/moe_ops.cpp.o
[ 66%] Building CUDA object CMakeFiles/_moe_C.dir/csrc/moe/topk_softmax_kernels.cu.o
/usr/include/c++/11/bits/std_function.h:435:145: error: parameter packs not expanded with ‘...’:
  435 |         function(_Functor&& __f)
      |                                                                                                                                                 ^ 
/usr/include/c++/11/bits/std_function.h:435:145: note:         ‘_ArgTypes’
/usr/include/c++/11/bits/std_function.h:530:146: error: parameter packs not expanded with ‘...’:
  530 |         operator=(_Functor&& __f)
      |                                                                                                                                                  ^ 
/usr/include/c++/11/bits/std_function.h:530:146: note:         ‘_ArgTypes’
gmake[5]: *** [CMakeFiles/_moe_C.dir/build.make:90: CMakeFiles/_moe_C.dir/csrc/moe/topk_softmax_kernels.cu.o] Error 1
gmake[5]: Leaving directory '/home/gpu/ai/llm/quantise/text-generation-inference/server/vllm/build/temp.linux-x86_64-cpython-311'
gmake[4]: *** [CMakeFiles/Makefile2:112: CMakeFiles/_moe_C.dir/all] Error 2
gmake[4]: Leaving directory '/home/gpu/ai/llm/quantise/text-generation-inference/server/vllm/build/temp.linux-x86_64-cpython-311'
gmake[3]: *** [CMakeFiles/Makefile2:119: CMakeFiles/_moe_C.dir/rule] Error 2
gmake[3]: Leaving directory '/home/gpu/ai/llm/quantise/text-generation-inference/server/vllm/build/temp.linux-x86_64-cpython-311'
gmake[2]: *** [Makefile:182: _moe_C] Error 2
gmake[2]: Leaving directory '/home/gpu/ai/llm/quantise/text-generation-inference/server/vllm/build/temp.linux-x86_64-cpython-311'
Traceback (most recent call last):
  File "/home/gpu/ai/llm/quantise/text-generation-inference/server/vllm/setup.py", line 383, in <module>
    setup(
  File "/home/gpu/miniconda3/envs/tgi/lib/python3.11/site-packages/setuptools/__init__.py", line 108, in setup
    return distutils.core.setup(**attrs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/gpu/miniconda3/envs/tgi/lib/python3.11/site-packages/setuptools/_distutils/core.py", line 184, in setup
    return run_commands(dist)
           ^^^^^^^^^^^^^^^^^^
  File "/home/gpu/miniconda3/envs/tgi/lib/python3.11/site-packages/setuptools/_distutils/core.py", line 200, in run_commands
    dist.run_commands()
  File "/home/gpu/miniconda3/envs/tgi/lib/python3.11/site-packages/setuptools/_distutils/dist.py", line 970, in run_commands
    self.run_command(cmd)
  File "/home/gpu/miniconda3/envs/tgi/lib/python3.11/site-packages/setuptools/dist.py", line 956, in run_command
    super().run_command(command)
  File "/home/gpu/miniconda3/envs/tgi/lib/python3.11/site-packages/setuptools/_distutils/dist.py", line 989, in run_command
    cmd_obj.run()
  File "/home/gpu/miniconda3/envs/tgi/lib/python3.11/site-packages/setuptools/_distutils/command/build.py", line 135, in run
    self.run_command(cmd_name)
  File "/home/gpu/miniconda3/envs/tgi/lib/python3.11/site-packages/setuptools/_distutils/cmd.py", line 316, in run_command
    self.distribution.run_command(command)
  File "/home/gpu/miniconda3/envs/tgi/lib/python3.11/site-packages/setuptools/dist.py", line 956, in run_command
    super().run_command(command)
  File "/home/gpu/miniconda3/envs/tgi/lib/python3.11/site-packages/setuptools/_distutils/dist.py", line 989, in run_command
    cmd_obj.run()
  File "/home/gpu/miniconda3/envs/tgi/lib/python3.11/site-packages/setuptools/command/build_ext.py", line 93, in run
    _build_ext.run(self)
  File "/home/gpu/miniconda3/envs/tgi/lib/python3.11/site-packages/setuptools/_distutils/command/build_ext.py", line 359, in run
    self.build_extensions()
  File "/home/gpu/ai/llm/quantise/text-generation-inference/server/vllm/setup.py", line 188, in build_extensions
    subprocess.check_call(['cmake', *build_args], cwd=self.build_temp)
  File "/home/gpu/miniconda3/envs/tgi/lib/python3.11/subprocess.py", line 413, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['cmake', '--build', '.', '--target', '_moe_C', '-j', '64']' returned non-zero exit status 2.
make[1]: *** [Makefile-vllm:5: build-vllm-cuda] Error 1
make[1]: Leaving directory '/home/gpu/ai/llm/quantise/text-generation-inference/server'
make: *** [Makefile:2: install-server] Error 2
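
The Python traceback and the trailing Error 2 lines only say that a child process failed; the actual compiler diagnostics are the two std_function.h errors printed while building topk_softmax_kernels.cu. A minimal sketch for pulling such lines out of a long build log follows. The helper `compiler_errors` and its regex are illustrative assumptions, not part of TGI or vLLM. Messages of this "parameter packs not expanded" kind have been reported when an older nvcc parses the GCC 11 standard library headers, but the thread does not state the CUDA version, so treat that as a hypothesis to verify.

```python
import re

# Matches GCC/nvcc-style diagnostics of the form "<file>:<line>:<col>: error: <message>".
ERROR_RE = re.compile(
    r"^(?P<file>[^\s:]+):(?P<line>\d+):(?P<col>\d+): error: (?P<msg>.*)$"
)


def compiler_errors(log_text):
    """Return (file, line, column, message) for every compiler error line in a build log."""
    found = []
    for raw in log_text.splitlines():
        m = ERROR_RE.match(raw.strip())
        if m:
            found.append((m["file"], int(m["line"]), int(m["col"]), m["msg"]))
    return found


if __name__ == "__main__":
    # A few lines copied from the log above.
    sample = """\
[ 66%] Building CUDA object CMakeFiles/_moe_C.dir/csrc/moe/topk_softmax_kernels.cu.o
/usr/include/c++/11/bits/std_function.h:435:145: error: parameter packs not expanded with ‘...’:
gmake[5]: *** [CMakeFiles/_moe_C.dir/build.make:90: CMakeFiles/_moe_C.dir/csrc/moe/topk_softmax_kernels.cu.o] Error 1
"""
    for err in compiler_errors(sample):
        print(err)
```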

z4iuyo4d  #3

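OK, got it. Thanks for the full stack trace 👍
So you are getting subprocess.CalledProcessError: Command '['cmake', '--build', '.', '--target', '_moe_C', '-j', '64']' returned non-zero exit status 2.
I think Unix exit status 2 means an invalid shell command.
A better way to understand it would be to run that command directly in a terminal and see why it is invalid.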
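
When re-running the failing command by hand, note that the traceback shows setup.py invoking it with cwd=self.build_temp, that is, from inside the build/temp.linux-x86_64-cpython-311 directory. A minimal sketch that reproduces this, assuming a hypothetical helper `run_in_dir`:

```python
import subprocess


def run_in_dir(cmd, workdir):
    """Run cmd with workdir as the working directory; return (exit_code, combined_output)."""
    proc = subprocess.run(
        cmd,
        cwd=workdir,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr so compiler errors are captured too
        text=True,
    )
    return proc.returncode, proc.stdout


# Example (path taken from the log in this thread; adjust it to your checkout):
# code, out = run_in_dir(
#     ["cmake", "--build", ".", "--target", "_moe_C", "-j", "64"],
#     "/home/gpu/ai/llm/quantise/text-generation-inference/server/vllm/build/temp.linux-x86_64-cpython-311",
# )
# print(code)
# print(out)
```

Capturing the output this way keeps the first compiler error visible even when 64 parallel jobs interleave their messages.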

uhry853o  #4

Running cmake --build . --target _moe_C -j 64 gives:
Error: could not load cache
@ErikKaum
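
cmake --build . expects the current directory to be an already configured build tree, which CMake recognises by its CMakeCache.txt file. One plausible reading (an assumption, since the working directory is not shown) is that the command was run somewhere other than the build/temp.linux-x86_64-cpython-311 directory named in the log. A minimal sketch for checking a candidate directory first; `has_cmake_cache` is a hypothetical helper:

```python
from pathlib import Path


def has_cmake_cache(build_dir):
    """Return True if build_dir contains a CMakeCache.txt, i.e. CMake has configured it."""
    return (Path(build_dir) / "CMakeCache.txt").is_file()


if __name__ == "__main__":
    # Check the current directory before running `cmake --build .` in it.
    print("configured build tree" if has_cmake_cache(".") else "no CMakeCache.txt here")
```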

bfhwhh0e  #5

That still looks like a cmake error: https://stackoverflow.com/questions/16319292/cmake-error-could-not-load-cache
By the way, do you need to build TGI from source? In general, it is easier to get started if you run the dockerized version.

ymzxtsji  #6

Actually, I am already running everything inside a GPU Docker container, so I think running another Docker container inside it might cause conflicts. @ErikKaum

e4yzc0pl  #7

Yes, that certainly doesn't make things any easier!
