I, and seemingly a few other people using this repo, have been running into a problem where models end up loaded into CPU memory after inactivity, sleep mode, etc. Would it be possible to add a flag that makes Ollama fail if it cannot load the model onto the GPU, and then retry?
1 Answer

7eumitmz1#
This may have happened during a GPU driver update; I've filed a bug report for it. After updating the Nvidia GPU driver, Ollama stops using GPU acceleration until you restart the Ollama service or reboot the machine.
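Until a flag like the one requested exists, a workaround is to watch where loaded models are placed and force a reload when one has fallen back to CPU. The sketch below is an assumption-heavy illustration: it shells out to `ollama ps` (whose `PROCESSOR` column reports values like `100% GPU` or `100% CPU`) and stops any model reported on CPU so the next request triggers a fresh load attempt. The exact column layout of `ollama ps` may differ between versions, so treat the parsing as a starting point, not a guarantee.

```python
import subprocess

def models_on_cpu(ps_output: str) -> list[str]:
    """Return names of loaded models whose row mentions CPU.

    Assumes `ollama ps` prints a header row followed by one
    whitespace-separated row per loaded model, with the model
    name in the first column and the placement (e.g. "100% CPU",
    "100% GPU") somewhere in the row.
    """
    rows = ps_output.strip().splitlines()[1:]  # skip the header row
    return [row.split()[0] for row in rows if "CPU" in row]

def restart_cpu_models() -> None:
    """Unload any model that fell back to CPU so it reloads fresh."""
    out = subprocess.run(
        ["ollama", "ps"], capture_output=True, text=True
    ).stdout
    for name in models_on_cpu(out):
        # `ollama stop <model>` unloads the model; the next request
        # forces a new load, which should land on the GPU if the
        # driver is healthy again.
        subprocess.run(["ollama", "stop", name], check=False)
```

Running `restart_cpu_models()` from a cron job or a timer after resume-from-sleep would approximate the requested fail-and-retry behavior without any changes to Ollama itself.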