ollama: add support for ViP-LLaVA?


Hi Ollama team,
ViP-LLaVA is a region-level large multimodal model from the LLaVA team that can understand visual prompts such as scribbles, bounding boxes, and arrows.
It requires only a few lines of changes to the original LLaVA code, and Hugging Face has already integrated ViP-LLaVA into the official transformers library: https://huggingface.co/docs/transformers/main/model_doc/vipllava
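For reference, here is a minimal sketch of running ViP-LLaVA through that transformers integration. The checkpoint name (llava-hf/vip-llava-7b-hf) and the prompt template are assumptions drawn from the linked docs, so please verify them there before relying on this:

```python
import requests
import torch
from PIL import Image
from transformers import AutoProcessor, VipLlavaForConditionalGeneration

# Checkpoint name taken from the Hugging Face docs linked above (assumption).
model_id = "llava-hf/vip-llava-7b-hf"

device = "cuda" if torch.cuda.is_available() else "cpu"
model = VipLlavaForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.float16
).to(device)
processor = AutoProcessor.from_pretrained(model_id)

# ViP-LLaVA uses its own chat template; "<image>" marks where the image
# features are inserted. Template copied from the model docs (assumption).
prompt = (
    "A chat between a curious human and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the human's "
    "questions.###Human: <image>\nWhat is shown in this image?###Assistant:"
)

# Any RGB image works; ViP-LLaVA's visual prompts (boxes, arrows, scribbles)
# are drawn directly onto the image pixels before encoding.
url = "https://www.ilankelman.org/stopsigns/australia.jpg"  # example image
image = Image.open(requests.get(url, stream=True).raw)

inputs = processor(text=prompt, images=image, return_tensors="pt").to(device)
# Cast pixel values to match the fp16 model weights.
inputs["pixel_values"] = inputs["pixel_values"].to(torch.float16)

output = model.generate(**inputs, max_new_tokens=100)
print(processor.decode(output[0], skip_special_tokens=True))
```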
Would you consider adding ViP-LLaVA here?
Thank you!
Mu Cai
