When running examples/quantization/basic_usage_gpt_xl.py, an error occurs during model packing:
2023-05-22 04:08:34 INFO [auto_gptq.quantization.gptq] duration: 0.16880011558532715
2023-05-22 04:08:34 INFO [auto_gptq.quantization.gptq] avg loss: 333.67144775390625
2023-05-22 04:08:34 INFO [auto_gptq.modeling._base] Quantizing attn.c_proj in layer 48/48...
2023-05-22 04:08:34 INFO [auto_gptq.quantization.gptq] duration: 0.16850876808166504
2023-05-22 04:08:34 INFO [auto_gptq.quantization.gptq] avg loss: 18.650110244750977
2023-05-22 04:08:35 INFO [auto_gptq.modeling._base] Quantizing mlp.c_fc in layer 48/48...
2023-05-22 04:08:35 INFO [auto_gptq.quantization.gptq] duration: 0.16927051544189453
2023-05-22 04:08:35 INFO [auto_gptq.quantization.gptq] avg loss: 624.6328125
2023-05-22 04:08:35 INFO [auto_gptq.modeling._base] Quantizing mlp.c_proj in layer 48/48...
2023-05-22 04:08:36 INFO [auto_gptq.quantization.gptq] duration: 0.6983904838562012
2023-05-22 04:08:36 INFO [auto_gptq.quantization.gptq] avg loss: 1028.072509765625
2023-05-22 04:08:36 INFO [auto_gptq.modeling._utils] Packing model...
2023-05-22 04:08:36 INFO [auto_gptq.modeling._utils] transformer.h.0.attn.c_attn
Traceback (most recent call last):
  File "/AutoGPTQ/examples/quantization/basic_usage_gpt_xl.py", line 93, in <module>
    main()
  File "/AutoGPTQ/examples/quantization/basic_usage_gpt_xl.py", line 67, in main
    model.quantize(traindataset, use_triton=False)
  File "/home/user/miniconda3/envs/pytorch2/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/user/miniconda3/envs/pytorch2/lib/python3.10/site-packages/auto_gptq/modeling/_base.py", line 350, in quantize
    pack_model(
  File "/home/user/miniconda3/envs/pytorch2/lib/python3.10/site-packages/auto_gptq/modeling/_utils.py", line 110, in pack_model
    qlayers[name].pack(layers[name], scale, zero, g_idx)
  File "/home/user/miniconda3/envs/pytorch2/lib/python3.10/site-packages/auto_gptq/nn_modules/qlinear_old.py", line 96, in pack
    (linear.weight.data[:, idx] + scale_zeros[g_idx]) / self.scales[g_idx]
RuntimeError: The size of tensor a (1600) must match the size of tensor b (4800) at non-singleton dimension 0
The error does not occur when desc_act=False is changed to desc_act=True. Note that the error still occurs with desc_act=False even when group_size is set to -1. The AutoGPTQ version is the latest, d4011d2.
Environment: Ubuntu 22.04, PyTorch 2.0.0, CUDA 11.8, transformers 4.29.2, GPU: RTX 4090
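For reference, the workaround described above amounts to enabling desc_act in the quantize config. Below is a minimal sketch modeled on the AutoGPTQ README-style API rather than the exact basic_usage_gpt_xl.py script; the calibration text, model name, and output directory are placeholders:

# Sketch: quantize GPT2-XL with desc_act=True, which avoids the packing error reported above.
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig

pretrained_model_dir = "gpt2-xl"
quantized_model_dir = "gpt2-xl-4bit-128g"

tokenizer = AutoTokenizer.from_pretrained(pretrained_model_dir)
# A single short calibration example just to make the sketch self-contained;
# the real example script builds a proper calibration dataset.
examples = [
    tokenizer("auto-gptq is an easy-to-use model quantization library.", return_tensors="pt")
]

quantize_config = BaseQuantizeConfig(
    bits=4,
    group_size=128,
    desc_act=True,  # with desc_act=False the pack step fails as shown in the traceback
)

model = AutoGPTQForCausalLM.from_pretrained(pretrained_model_dir, quantize_config)
model.quantize(examples, use_triton=False)
model.save_quantized(quantized_model_dir)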
1 answer
I ran into the same problem when trying to quantize a model with AutoGPTQ. Is there a solution for this yet?