Converted and quantized to Q6_K using convert-hf-to-gguf.py and quantize from llama.cpp b2690, but when I try to run it with ollama run (0.1.32) I get:
Error: llama runner process no longer running: 3221226505
The same GGUF file can be launched using llama.cpp.
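For reference, the conversion and import workflow looked roughly like this. This is a sketch, not the exact commands used: paths and the model name are placeholders, and the flags reflect the llama.cpp tools as of that era (b2690). Exit code 3221226505 is 0xC0000409 (STATUS_STACK_BUFFER_OVERRUN), which suggests the ollama runner process crashed on Windows rather than rejecting the file:

```shell
# Convert the HF checkpoint to a GGUF file (f16), then quantize to Q6_K.
# "./my-hf-model" and "my-model" are placeholder names.
python convert-hf-to-gguf.py ./my-hf-model --outtype f16 --outfile model-f16.gguf
./quantize model-f16.gguf model-Q6_K.gguf Q6_K

# Import the quantized GGUF into Ollama via a Modelfile...
echo "FROM ./model-Q6_K.gguf" > Modelfile
ollama create my-model -f Modelfile

# ...and run it; this is the step that dies with 3221226505.
ollama run my-model
```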
Answers

gfttwv5a1#
It will definitely get quantized, but right now it only has a 4k context. They will release a new version later.
If you see it on HF and I've missed it, please send me a message here.