vllm cannot be built from source for ROCm (PyTorch and Xformers work fine)

nwsw7zdq · posted 2 months ago · in: Other

OS: Linux 6.6.17-1-lts
Hardware: AMD 4650G (Renoir), gfx90c
Software: torch==2.3.0.dev20240224+rocm5.7, xformers==0.0.23 (both confirmed working).
Problem: after following the installation guide for building from source with ROCm:

Total number of replaced kernel launches: 21
running install
/home/toto/tmp/testenv/lib/python3.11/site-packages/setuptools/command/install.py:34: SetuptoolsDeprecationWarning: setup.py install is deprecated. Use build and pip and other standards-based tools.
  warnings.warn(
/home/toto/tmp/testenv/lib/python3.11/site-packages/setuptools/command/easy_install.py:144: EasyInstallDeprecationWarning: easy_install command is deprecated. Use build and pip and other standards-based tools.
  warnings.warn(
running bdist_egg
running egg_info
writing vllm.egg-info/PKG-INFO
writing dependency_links to vllm.egg-info/dependency_links.txt
writing requirements to vllm.egg-info/requires.txt
writing top-level names to vllm.egg-info/top_level.txt
reading manifest file 'vllm.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
adding license file 'LICENSE'
writing manifest file 'vllm.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_py
running build_ext
building 'vllm._C' extension
Emitting ninja build file /home/toto/tmp/vllm/build/temp.linux-x86_64-cpython-311/build.ninja...
Compiling objects...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
ninja: no work to do.
g++ -shared -Wl,-O1,--sort-common,--as-needed,-z,relro,-z,now -flto=auto -Wl,-O1,--sort-common,--as-needed,-z,relro,-z,now -flto=auto /home/toto/tmp/vllm/build/temp.linux-x86_64-cpython-311/csrc/activation_kernels.o /home/toto/tmp/vllm/build/temp.linux-x86_64-cpython-311/csrc/attention/attention_kernels.o /home/toto/tmp/vllm/build/temp.linux-x86_64-cpython-311/csrc/cache_kernels.o /home/toto/tmp/vllm/build/temp.linux-x86_64-cpython-311/csrc/hip_utils_kernels.o /home/toto/tmp/vllm/build/temp.linux-x86_64-cpython-311/csrc/layernorm_kernels.o /home/toto/tmp/vllm/build/temp.linux-x86_64-cpython-311/csrc/moe_align_block_size_kernels.o /home/toto/tmp/vllm/build/temp.linux-x86_64-cpython-311/csrc/pos_encoding_kernels.o /home/toto/tmp/vllm/build/temp.linux-x86_64-cpython-311/csrc/pybind.o /home/toto/tmp/vllm/build/temp.linux-x86_64-cpython-311/csrc/quantization/gptq/q_gemm.o /home/toto/tmp/vllm/build/temp.linux-x86_64-cpython-311/csrc/quantization/squeezellm/quant_hip_kernel.o -L/home/toto/tmp/testenv/lib/python3.11/site-packages/torch/lib -L/opt/rocm/lib -L/opt/rocm/hip/lib -L/usr/lib -lc10 -ltorch -ltorch_cpu -ltorch_python -lamdhip64 -lc10_hip -ltorch_hip -o build/lib.linux-x86_64-cpython-311/vllm/_C.cpython-311-x86_64-linux-gnu.so
/usr/bin/ld: /home/toto/tmp/vllm/build/temp.linux-x86_64-cpython-311/csrc/cache_kernels.o: in function `__float2bfloat16(float)':
cache_kernels.hip:(.text+0x0): multiple definition of `__float2bfloat16(float)'; /home/toto/tmp/vllm/build/temp.linux-x86_64-cpython-311/csrc/attention/attention_kernels.o:attention_kernels.hip:(.text+0x0): first defined here
/usr/bin/ld: /home/toto/tmp/vllm/build/temp.linux-x86_64-cpython-311/csrc/cache_kernels.o: in function `__bfloat1622float2(__hip_bfloat162)':
cache_kernels.hip:(.text+0x40): multiple definition of `__bfloat1622float2(__hip_bfloat162)'; /home/toto/tmp/vllm/build/temp.linux-x86_64-cpython-311/csrc/attention/attention_kernels.o:attention_kernels.hip:(.text+0x40): first defined here
/usr/bin/ld: /home/toto/tmp/vllm/build/temp.linux-x86_64-cpython-311/csrc/cache_kernels.o: in function `__double2bfloat16(double)':
cache_kernels.hip:(.text+0x60): multiple definition of `__double2bfloat16(double)'; /home/toto/tmp/vllm/build/temp.linux-x86_64-cpython-311/csrc/attention/attention_kernels.o:attention_kernels.hip:(.text+0x60): first defined here
/usr/bin/ld: /home/toto/tmp/vllm/build/temp.linux-x86_64-cpython-311/csrc/cache_kernels.o: in function `__float22bfloat162_rn(HIP_vector_type<float, 2u>)':
cache_kernels.hip:(.text+0xa0): multiple definition of `__float22bfloat162_rn(HIP_vector_type<float, 2u>)'; /home/toto/tmp/vllm/build/temp.linux-x86_64-cpython-311/csrc/attention/attention_kernels.o:attention_kernels.hip:(.text+0xa0): first defined here
/usr/bin/ld: /home/toto/tmp/vllm/build/temp.linux-x86_64-cpython-311/csrc/cache_kernels.o: in function `__high2float(__hip_bfloat162)':
cache_kernels.hip:(.text+0x110): multiple definition of `__high2float(__hip_bfloat162)'; /home/toto/tmp/vllm/build/temp.linux-x86_64-cpython-311/csrc/attention/attention_kernels.o:attention_kernels.hip:(.text+0x110): first defined here
/usr/bin/ld: /home/toto/tmp/vllm/build/temp.linux-x86_64-cpython-311/csrc/cache_kernels.o: in function `__low2float(__hip_bfloat162)':
cache_kernels.hip:(.text+0x120): multiple definition of `__low2float(__hip_bfloat162)'; /home/toto/tmp/vllm/build/temp.linux-x86_64-cpython-311/csrc/attention/attention_kernels.o:attention_kernels.hip:(.text+0x120): first defined here
collect2: error: ld returned 1 exit status
error: command '/usr/bin/g++' failed with exit code 1

vddsk6oq1#

I ran into the same problem building vllm from source on two platforms:
First platform:

  • vllm tag v0.3.2
  • ROCm 6.0.2
  • PyTorch 2.1.2+git98a6632
  • xformers 0.0.23

Second platform:

  • vllm tag v0.3.2
  • ROCm 5.7.0
  • PyTorch 2.0.1+git4c8bc42
  • xformers 0.0.23

nfs0ujit2#

I ran into the same problem building vllm from source on the same platforms:
First:

  • vllm tag v0.3.2
  • ROCm 6.0.2
  • PyTorch 2.1.2+git98a6632
  • xformers 0.0.23

Second:

  • vllm tag v0.3.2
  • ROCm 5.7.0
  • PyTorch 2.0.1+git4c8bc42
  • xformers 0.0.23
g++ -pthread -B /opt/conda/envs/py_3.10/compiler_compat -shared -Wl,-rpath,/opt/conda/envs/py_3.10/lib -Wl,-rpath-link,/opt/conda/envs/py_3.10/lib -L/opt/conda/envs/py_3.10/lib -Wl,-rpath,/opt/conda/envs/py_3.10/lib -Wl,-rpath-link,/opt/conda/envs/py_3.10/lib -L/opt/conda/envs/py_3.10/lib /workspace/vllm/build/temp.linux-x86_64-cpython-310/csrc/activation_kernels.o /workspace/vllm/build/temp.linux-x86_64-cpython-310/csrc/attention/attention_kernels.o /workspace/vllm/build/temp.linux-x86_64-cpython-310/csrc/cache_kernels.o /workspace/vllm/build/temp.linux-x86_64-cpython-310/csrc/hip_utils_kernels.o /workspace/vllm/build/temp.linux-x86_64-cpython-310/csrc/layernorm_kernels.o /workspace/vllm/build/temp.linux-x86_64-cpython-310/csrc/moe_align_block_size_kernels.o /workspace/vllm/build/temp.linux-x86_64-cpython-310/csrc/pos_encoding_kernels.o /workspace/vllm/build/temp.linux-x86_64-cpython-310/csrc/pybind.o /workspace/vllm/build/temp.linux-x86_64-cpython-310/csrc/quantization/gptq/q_gemm.o /workspace/vllm/build/temp.linux-x86_64-cpython-310/csrc/quantization/squeezellm/quant_hip_kernel.o -L/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib -L/opt/rocm/lib -L/opt/rocm/hip/lib -lc10 -ltorch -ltorch_cpu -ltorch_python -lamdhip64 -lc10_hip -ltorch_hip -o build/lib.linux-x86_64-cpython-310/vllm/_C.cpython-310-x86_64-linux-gnu.so
/opt/conda/envs/py_3.10/compiler_compat/ld: /workspace/vllm/build/temp.linux-x86_64-cpython-310/csrc/cache_kernels.o: in function `__float2bfloat16(float)':
cache_kernels.hip:(.text+0x0): multiple definition of `__float2bfloat16(float)'; /workspace/vllm/build/temp.linux-x86_64-cpython-310/csrc/attention/attention_kernels.o:attention_kernels.hip:(.text+0x0): first defined here
/opt/conda/envs/py_3.10/compiler_compat/ld: /workspace/vllm/build/temp.linux-x86_64-cpython-310/csrc/cache_kernels.o: in function `__bfloat1622float2(__hip_bfloat162)':
cache_kernels.hip:(.text+0x40): multiple definition of `__bfloat1622float2(__hip_bfloat162)'; /workspace/vllm/build/temp.linux-x86_64-cpython-310/csrc/attention/attention_kernels.o:attention_kernels.hip:(.text+0x40): first defined here
/opt/conda/envs/py_3.10/compiler_compat/ld: /workspace/vllm/build/temp.linux-x86_64-cpython-310/csrc/cache_kernels.o: in function `__double2bfloat16(double)':
cache_kernels.hip:(.text+0x60): multiple definition of `__double2bfloat16(double)'; /workspace/vllm/build/temp.linux-x86_64-cpython-310/csrc/attention/attention_kernels.o:attention_kernels.hip:(.text+0x60): first defined here
/opt/conda/envs/py_3.10/compiler_compat/ld: /workspace/vllm/build/temp.linux-x86_64-cpython-310/csrc/cache_kernels.o: in function `__float22bfloat162_rn(HIP_vector_type<float, 2u>)':
cache_kernels.hip:(.text+0xa0): multiple definition of `__float22bfloat162_rn(HIP_vector_type<float, 2u>)'; /workspace/vllm/build/temp.linux-x86_64-cpython-310/csrc/attention/attention_kernels.o:attention_kernels.hip:(.text+0xa0): first defined here
/opt/conda/envs/py_3.10/compiler_compat/ld: /workspace/vllm/build/temp.linux-x86_64-cpython-310/csrc/cache_kernels.o: in function `__high2float(__hip_bfloat162)':
cache_kernels.hip:(.text+0x110): multiple definition of `__high2float(__hip_bfloat162)'; /workspace/vllm/build/temp.linux-x86_64-cpython-310/csrc/attention/attention_kernels.o:attention_kernels.hip:(.text+0x110): first defined here
/opt/conda/envs/py_3.10/compiler_compat/ld: /workspace/vllm/build/temp.linux-x86_64-cpython-310/csrc/cache_kernels.o: in function `__low2float(__hip_bfloat162)':
cache_kernels.hip:(.text+0x120): multiple definition of `__low2float(__hip_bfloat162)'; /workspace/vllm/build/temp.linux-x86_64-cpython-310/csrc/attention/attention_kernels.o:attention_kernels.hip:(.text+0x120): first defined here
collect2: error: ld returned 1 exit status
error: command '/usr/bin/g++' failed with exit code 1

For now, I can build vllm from source on the first platform (ROCm 6.0.2) by appending static to two lines in /opt/rocm/include/hip/amd_detail/amd_hip_bf16.h, as in ROCm/clr@77c581a. Ref: #2646 (comment).
However, I then ran into another problem, as in #3061.


zzzyeukh3#

I ran into the same problem building vllm from source on two platforms:
First platform:

  • vllm tag v0.3.2
  • ROCm 6.0.2
  • PyTorch 2.1.2+git98a6632
  • xformers 0.0.23

Second platform:

  • vllm tag v0.3.2
  • ROCm 5.7.0
  • PyTorch 2.0.1+git4c8bc42
  • xformers 0.0.23

However, I then ran into another problem, as in #3061. I will try to work through it. What is your system setup? Is yours also an AMD iGPU?


kognpnkq4#

I ran into the same problem building vllm from source on two platforms:
First platform:

  • vllm tag v0.3.2
  • ROCm 6.0.2
  • PyTorch 2.1.2+git98a6632
  • xformers 0.0.23

Second platform:

  • vllm tag v0.3.2
  • ROCm 5.7.0
  • PyTorch 2.0.1+git4c8bc42
  • xformers 0.0.23

However, I then ran into another problem, as in #3061. I will try to work through it. What is your system setup? Is yours also an AMD iGPU?
I'm not using an AMD iGPU. The cards are MI210 and MI300X.


ff29svar5#

Because of the compiler error, the header needs to be patched by adding static:
--- amd_hip_bf16.h	2024-02-06 18:28:58.268699142 +0000
+++ amd_hip_bf16.h.new	2024-02-06 18:28:31.988647133 +0000
@@ -90,10 +90,10 @@
 #include "math_fwd.h" // ocml device functions
 
 #if defined(__HIPCC_RTC__)
 
-#define __HOST_DEVICE__ __device__
+#define __HOST_DEVICE__ __device__ static
 #else
 #include <climits>
-#define __HOST_DEVICE__ __host__ __device__
+#define __HOST_DEVICE__ __host__ __device__ static inline
 #endif


mutmk8jj6#

Same problem here, even with #2648.


unhi4e5o7#

Note that #2790 is needed; with it, this is fixed for me. This issue can probably be closed.


iyzzxitl8#

Please update this issue, or close it if your problem has been resolved. Thanks.
