ollama num_ctx参数在Linux上不起作用

klsxnrf1 于 4个月前发布在 Linux

关注(0)|答案(2)|浏览(126)

问题：尽管日志显示n_ctx=102400,但输入提示符仍然截断为2048。
Ollama版本：0.2.1
操作系统：AWS Linux,示例类型：g5.xlarge
GPU:Nvidia A10 24G GPU,版本12.x
模型：Qwen2-7B-Instruct,GGUF V3
Ollama服务器日志：

Device 0: NVIDIA A10G, compute capability 8.6, VMM: yes
llm_load_tensors: ggml ctx size =    0.30 MiB
llm_load_tensors: offloading 28 repeating layers to GPU
llm_load_tensors: offloading non-repeating layers to GPU
llm_load_tensors: offloaded 29/29 layers to GPU
llm_load_tensors:  CUDA_Host buffer size =   292.36 MiB
llm_load_tensors:      CUDA0 buffer size =  3928.07 MiB
**llama_new_context_with_model: n_ctx      = 102400**
llama_new_context_with_model: n_batch    = 512
llama_new_context_with_model: n_ubatch   = 512
llama_new_context_with_model: flash_attn = 1
llama_new_context_with_model: freq_base  = 1000000.0
llama_new_context_with_model: freq_scale = 1

上面的日志显示n_ctx=102400,但我仍然收到以下输入截断警告，n_ctx的值为2048。
LLM请求日志：

[GIN] 2024/07/13 - 02:27:54 | 200 | 17.512894039s |       127.0.0.1 | POST     "/api/chat"
INFO [update_slots] input truncated | **n_ctx=2048** n_erase=1440 n_keep=4 n_left=2044 n_shift=1022 tid="140646121488384" timestamp=1720837794
INFO [update_slots] input truncated | **n_ctx=2048** n_erase=1432 n_keep=4 n_left=2044 n_shift=1022 tid="140646121488384" timestamp=1720837794
INFO [update_slots] input truncated | **n_ctx=2048** n_erase=1441 n_keep=4 n_left=2044 n_shift=1022 tid="140646121488384" timestamp=1720837795

这个相同的模型+相同的Ollama版本在我Mac上运行良好，但是当我将其移动到AWS Linux时，出现了错误。

操作系统

Linux

GPU

Nvidia

CPU

Intel

Ollama版本

0.2.1

ollama

来源：https://github.com/ollama/ollama/issues/5661

2条答案

按热度按时间

bq3bfh9z1#

你好，你有OLLAMA_NUM_PARALLEL设置吗？

赞(0）回复(0）举报 4个月前

beq87vna2#

是的，这是我的脚本

export OLLAMA_HOST=0.0.0.0 
export OLLAMA_KEEP_ALIVE=600m
export OLLAMA_MAX_LOADED_MODELS=5
export OLLAMA_NUM_PARALLEL=50
export GIN_MODE=release
export OLLAMA_MAX_QUEUE=1024
export OLLAMA_ORIGINS="*"
ollama serve

赞(0）回复(0）举报 4个月前

我来回答

ollama num_ctx参数在Linux上不起作用

操作系统

GPU

CPU

Ollama版本

2条答案

相关问题

热门标签

最新问答