vllm [Usage]: How to pass a JSON content type for Mistral 7B using offline inference?

Posted by q7solyqu 2 months ago in Other


Current environment

The output of `python collect_env.py`

How would you like to use vllm

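I would like to use JSON mode for offline inference with the generate method. Is this possible? Prompting alone does not seem to produce the requested JSON output. If it is not possible, is the only solution to use something like Outlines? If there are any details, please let me know.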

from vllm import LLM, SamplingParams

llm = LLM(model="mistralai/Mistral-7B-v0.3")
prompt = "Please name the biggest and smallest continent in JSON using the following schema: {biggest: <the biggest continent's name>, smallest: <the smallest continent>}"
temperature = 0.0  # assumed value; the original snippet did not define it
sampling_params = SamplingParams(temperature=temperature, top_p=1.0)
response = llm.generate(prompt, sampling_params)  # returns a list of RequestOutput objects
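Since prompting alone does not guarantee well-formed JSON, one fallback that needs no extra library is to post-process the generated text: find the first JSON object in it and check that it has the expected keys. The sketch below is only an illustration of that idea, not a vLLM feature. The helper name `extract_json_object` and the constant `REQUIRED_KEYS` are hypothetical, and the key names are taken from the schema in the prompt above. It does not constrain decoding the way a library such as Outlines would; it only detects whether a usable object came back, so the caller can retry otherwise.

```python
import json
from typing import Optional

# Keys expected by the schema in the prompt (assumed to be required).
REQUIRED_KEYS = ("biggest", "smallest")


def extract_json_object(text: str) -> Optional[dict]:
    """Return the first JSON object in `text` that contains all
    REQUIRED_KEYS, or None if no such object can be parsed."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start`
            # and ignores any trailing text after it.
            obj, _ = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            obj = None
        if isinstance(obj, dict) and all(k in obj for k in REQUIRED_KEYS):
            return obj
        # Try again from the next opening brace, if there is one.
        start = text.find("{", start + 1)
    return None
```

With the snippet above, the generated text is available as `response[0].outputs[0].text`, so the helper would be called as `extract_json_object(response[0].outputs[0].text)`; a `None` result signals that the generation should be retried or handled as a failure.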


Source: https://github.com/vllm-project/vllm/issues/7030