mlc-llm [Bug] Severe accuracy degradation after quantizing Qwen2-1.5B-Instruct; please fix it

qjp7pelc posted 2 months ago in Other

🐛 Bug

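To Reproduce

  1. Steps to reproduce the behavior:
  2. Expected behavior:
  3. Environment:
  • Platform (e.g. WebGPU/Vulkan/IOS/Android/CUDA):
  • Operating system (e.g. Ubuntu/Windows/MacOS/...):
  • Device (e.g. iPhone 12 Pro, PC+RTX 3090, ...):
  • How you installed MLC-LLM (conda, source):
  • How you installed TVM-Unity (pip, source):
  • Python version (e.g. 3.10):
  • GPU driver version (if applicable):
  • CUDA/cuDNN version (if applicable):
  • TVM Unity Hash Tag (python -c "import tvm; print('\n'.join(f'{k}: {v}' for k, v in tvm.support.libinfo().items()))", applicable if you compile models):
  • Any other relevant information: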
x4shl7ld #1

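Thanks for opening this issue. If you don't mind, it would be more helpful if you could describe or share more about what you ran and what behavior you observed.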

sshcrbum #2

The model was compiled as follows:

mlc_llm convert_weight ./dist/models/Qwen2-1.5B/ --model-type qwen2 --quantization q4f16_1 -o dist/Qwen2-1.5B-q4f16_1-MLC
mlc_llm gen_config ./dist/models/Qwen2-1.5B/ --model-type qwen2 --quantization q4f16_1 --conv-template chatml --context-window-size 2048 --max-batch-size 1 -o dist/Qwen2-1.5B-q4f16_1-MLC/

My mlc-package-config.json is as follows:

{
 "device": "android",
 "model_list": [
 {
 "model": "/home/stephen/mlc-llm/dist/Qwen2-1.5B-Instruct-q4f16_1-MLC",
 "model_id": "Qwen2-1.5B-Instruct-q4f16_1-MLC",
 "estimated_vram_bytes": 3980990464,
 "bundle_weight": true
 }
 ]
}
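Note that the convert_weight and gen_config commands above write to dist/Qwen2-1.5B-q4f16_1-MLC, while the config points at a directory named Qwen2-1.5B-Instruct-q4f16_1-MLC. A quick check like the one below can confirm that every local path in the config exists before packaging. This is only a minimal sketch: check_package_config is a hypothetical helper, not part of mlc_llm, and it assumes that gen_config leaves an mlc-chat-config.json in the output directory and that entries starting with HF:// are remote models to be skipped.

```python
import json
import os


def check_package_config(config_path):
    """Return a list of human-readable problems found in the package config.

    Hypothetical helper for illustration; it only inspects the file system.
    """
    with open(config_path, encoding="utf-8") as f:
        config = json.load(f)

    problems = []
    for entry in config.get("model_list", []):
        model_dir = entry.get("model", "")
        model_id = entry.get("model_id", "")
        # Assumption: "HF://" entries are fetched remotely, so skip them.
        if model_dir.startswith("HF://"):
            continue
        # A local entry should point at an existing directory of converted weights.
        if not os.path.isdir(model_dir):
            problems.append(f"{model_id}: model directory not found: {model_dir}")
        # Assumption: gen_config wrote mlc-chat-config.json into that directory.
        elif not os.path.isfile(os.path.join(model_dir, "mlc-chat-config.json")):
            problems.append(f"{model_id}: mlc-chat-config.json missing in {model_dir}")
    return problems
```

Calling check_package_config("mlc-package-config.json") from the directory that holds the config returns an empty list when every local model directory looks complete.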

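Then I ran mlc_llm package to build libtvm4j_runtime_packed.so and tvm4j_core.jar, and opened android/MLCChat in Android Studio to build the APK. Finally I used python bundle_weight.py --apk-path app/release/app-release.apk to install the APK on a meta50 with a Snapdragon 8 Gen 1 chip. The result is shown as:

[screenshot not preserved]

However, when the same APK is installed on a Kirin 9000S chip with the Maleoon 910 GPU, the output is OK, only slower: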

cnh2zyt3 #3

2024-06-12 16:43:11.778 10890-10890 HwRemoteIn...hodManager ai.mlc.mlcchat W isCasting false because IHwDistributedWindowManager is invalid
2024-06-12 16:43:11.788 10890-10890 DecorView ai.mlc.mlcchat I navBarColor: fffcfcfc statusBarColor: ff00668b statusInsets: Insets{left=0, top=91, right=0, bottom=0} navInsets: Insets{left=0, top=0, right=0, bottom=0}
2024-06-12 16:43:11.788 10890-10890 DecorView ai.mlc.mlcchat I updateColorViewInt type:1 size: 0 showView:false color:fffcfcfc
2024-06-12 16:43:11.788 10890-10890 DecorView ai.mlc.mlcchat I updateColorViewInt type:0 size: 91 showView:true color:ff00668b
2024-06-12 16:43:11.800 10890-10890 HwViewRootImpl ai.mlc.mlcchat I removeInvalidNode all the node in jank list is out of time
2024-06-12 16:43:12.002 10890-10890 DecorView ai.mlc.mlcchat I navBarColor: fffcfcfc statusBarColor: ff00668b statusInsets: Insets{left=0, top=91, right=0, bottom=0} navInsets: Insets{left=0, top=0, right=0, bottom=0}
2024-06-12 16:43:12.002 10890-10890 DecorView ai.mlc.mlcchat I updateColorViewInt type:1 size: 0 showView:false color:fffcfcfc
2024-06-12 16:43:12.002 10890-10890 DecorView ai.mlc.mlcchat I updateColorViewInt type:0 size: 91 showView:true color:ff00668b
2024-06-12 16:43:15.568 10893-10933 HwViewRootImpl ai.mlc.mlcchat D [DetectViewsLocationRunner] start to get views' rect, type = SCENE_GESTURE_SINGLE_TAP
2024-06-12 16:43:15.575 10893-10933 HwViewRootImpl ai.mlc.mlcchat D [DetectViewsLocationRunner] windowModeType: 1
2024-06-12 16:43:15.575 10893-10933 HwViewRootImpl ai.mlc.mlcchat D [DetectViewsLocationRunner] displayPoint: Point(977, 555)
2024-06-12 16:43:15.575 10893-10933 HwViewRootImpl ai.mlc.mlcchat D [DetectViewsLocationRunner] windowModeType: 1
2024-06-12 16:43:15.575 10893-10933 HwViewRootImpl ai.mlc.mlcchat D [DetectViewsLocationRunner] lazyMode:
2024-06-12 16:43:15.575 10893-10933 HwViewRootImpl ai.mlc.mlcchat D [DetectViewsLocationRunner] current mode is full screen
2024-
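Most of the lines in the log above come from UI tags (DecorView, HwViewRootImpl, and so on) and say little about inference itself. The sketch below shows one way to drop them and keep the rest. It assumes the Android Studio logcat layout seen above (date, time, pid-tid, tag, package, level, message); filter_log and the list of noisy tag prefixes are hypothetical choices for illustration, not anything provided by MLC-LLM or Android.

```python
import re

# One logcat line in the layout shown above:
# date time pid-tid tag package level message
LOG_LINE = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2}) (?P<time>\d{2}:\d{2}:\d{2}\.\d{3})\s+"
    r"(?P<pid>\d+)-(?P<tid>\d+)\s+(?P<tag>\S+)\s+(?P<package>\S+)\s+"
    r"(?P<level>[VDIWEF])\s+(?P<message>.*)$"
)

# Assumption: these tag prefixes are UI noise for this investigation.
UI_NOISE_TAGS = ("DecorView", "HwViewRootImpl", "HwRemoteIn")


def filter_log(lines, noise_prefixes=UI_NOISE_TAGS):
    """Return (time, tag, level, message) tuples for lines that are not UI noise."""
    kept = []
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match is None:
            continue  # truncated or unrecognized line
        if match.group("tag").startswith(noise_prefixes):
            continue  # known UI tag, not relevant here
        kept.append(
            (
                match.group("time"),
                match.group("tag"),
                match.group("level"),
                match.group("message"),
            )
        )
    return kept
```

Passing the lines of a saved logcat dump to filter_log leaves only entries from other tags, which makes it easier to spot anything logged by the runtime when the wrong output appears.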

5q4ezhmt #4

@jeethu Have you found any issue?

wdebmtf2 #5

Has any issue been found?

puruo6ea #6

Has any issue been found?
