ollama: I can't get vision models to work

ht4b089n  posted 2 months ago in Other

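What is the issue?

I'm running ollama through Docker and everything works well, except for the vision models.
I tried llava and bakllava, but neither worked.

What did you expect to see?

A description of the image I provided.

Steps to reproduce

Run ollama with Docker as in the example, then pull the latest llava or bakllava model.
Run a test query like the one in https://github.com/ollama/ollama/blob/main/docs/api.md#request-with-images. The answer is not what I expected and is always random, for example: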

{
    "model": "llava",
    "created_at": "2024-03-24T05:02:22.859351985Z",
    "response": " The image shows a person sitting at a table with some papers or documents. The focus is on the person's face, which appears to be in deep thought or concentration. There are no other discernable objects or details in the picture. ",
    "done": true,
    "context": [...

I tried both llava and bakllava; every other model seems to work fine. I tried both very high-quality images and images with simple content.
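
The request from the reproduction steps can also be built programmatically, which rules out copy-and-paste damage to the base64 string. The following is a minimal Python sketch, not an official client. The endpoint and field names follow the curl request quoted later in this thread; `build_payload`, `describe_image` and the file name `example.png` are hypothetical names chosen for illustration.

```python
import base64
import json
import urllib.request

# Assumption: the default local endpoint, as in the curl request in this thread.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(image_bytes: bytes, model: str = "llava",
                  prompt: str = "What is in this picture?") -> dict:
    """Build the JSON body for a non-streaming request with one image."""
    # Encode the raw file bytes as one unbroken base64 string,
    # the same shape as the "images" entry in the curl request.
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {"model": model, "prompt": prompt, "stream": False, "images": [encoded]}


def describe_image(path: str, url: str = OLLAMA_URL) -> str:
    """Send one image file to the server and return the "response" text."""
    with open(path, "rb") as f:
        payload = build_payload(f.read())
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]


# Usage (requires a running server and a local image file):
# print(describe_image("example.png"))
```

If the same image yields a sensible description through this path but not through a hand-written request, the payload encoding is the likely culprit rather than the model.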

Did a recent change introduce this issue?

No response

OS

Linux

Architecture

arm64

Platform

Docker

Ollama version

ollama version is 0.1.28

GPU

No response

GPU info

No response

CPU

No response

Other software

No response

u5rb5r59 #1

This may be a duplicate of #3298. I ran into the same issue when running llava locally on my Mac.

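0s7z1bwu #2

"This may be a duplicate issue. I ran into the same issue when running llava locally on my Mac."
I saw that issue, but I opened this one because it seems to be a different problem:
first, I'm experimenting through the REST API, and second, there seems to be a downsampling issue.
Thanks for linking that issue anyway; they may be related.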

ujv3wf0j #3

I can't reproduce this. Using the example from the link, this is what I get:

$ curl http://localhost:11434/api/generate -d '{
  "model": "llava",
  "prompt":"What is in this picture?",
  "stream": false,
  "images": ["iVBORw0KGgoAAAANSUhEUgAAAG0AAABmCAYAAADBPx+VAAAACXBIWXMAAAsTAAALEwEAmpwYAAAAAXNSR0IArs4c6QAAAARnQU1BAACxjwv8YQUAAA3VSURBVHgB7Z27r0zdG8fX743i1bi1ikMoFMQloXRpKFFIqI7LH4BEQ+NWIkjQuSWCRIEoULk0gsK1kCBI0IhrQVT7tz/7zZo888yz1r7MnDl7z5xvsjkzs2fP3uu71nNfa7lkAsm7d++Sffv2JbNmzUqcc8m0adOSzZs3Z+/XES4ZckAWJEGWPiCxjsQNLWmQsWjRIpMseaxcuTKpG/7HP27I8P79e7dq1ars/yL4/v27S0ejqwv+cUOGEGGpKHR37tzJCEpHV9tnT58+dXXCJDdECBE2Ojrqjh071hpNECjx4cMHVycM1Uhbv359B2F79+51586daxN/+pyRkRFXKyRDAqxEp4yMlDDzXG1NPnnyJKkThoK0VFd1ELZu3TrzXKxKfW7dMBQ6bcuWLW2v0VlHjx41z717927ba22U9APcw7Nnz1oGEPeL3m3p2mTAYYnFmMOMXybPPXv2bNIPpFZr1NHn4HMw0KRBjg9NuRw95s8PEcz/6DZELQd/09C9QGq5RsmSRybqkwHGjh07OsJSsYYm3ijPpyHzoiacg35MLdDSIS/O1yM778jOTwYUkKNHWUzUWaOsylE00MyI0fcnOwIdjvtNdW/HZwNLGg+sR1kMepSNJXmIwxBZiG8tDTpEZzKg0GItNsosY8USkxDhD0Rinuiko2gfL/RbiD2LZAjU9zKQJj8RDR0vJBR1/Phx9+PHj9Z7REF4nTZkxzX4LCXHrV271qXkBAPGfP/atWvu/PnzHe4C97F48eIsRLZ9+3a3f/9+87dwP1JxaF7/3r17ba+5l4EcaVo0lj3SBq5kGTJSQmLWMjgYNei2GPT1MuMqGTDEFHzeQSP2wi/jGnkmPJ/nhccs44jvDAxpVcxnq0F6eT8h4ni/iIWpR5lPyA6ETkNXoSukvpJAD3AsXLiwpZs49+fPn5ke4j10TqYvegSfn0OnafC+Tv9ooA/JPkgQysqQNBzagXY55nO/oa1F7qvIPWkRL12WRpMWUvpVDYmxAPehxWSe8ZEXL20sadYIozfmNch4QJPAfeJgW3rNsnzphBKNJM2KKODo1rVOMRYik5ETy3ix4qWNI81qAAirizgMIc+yhTytx0JWZuNI03qsrgWlGtwjoS9XwgUhWGyhUaRZZQNNIEwCiXD16tXcAHUs79co0vSD8rrJCIW98pzvxpAWyyo3HYwqS0+H0BjStClcZJT5coMm6D2LOF8TolGJtK9fvyZpyiC5ePFi9nc/oJU4eiEP0jVoAnHa9wyJycITMP78+eMeP37sXrx44d6+fdt6f82aNdkx1pg9e3Zb5W+RSRE+n+VjksQWifvVaTKFhn5O8my63K8Qabdv33b379/PiAP//vuvW7BggZszZ072/+TJk91YgkafPn166zXB1rQHFvouAWHq9z3SEevSUerqCn2/dDCeta2jxYbr69evk4MHDyY7d+7MjhMnTiTPnz9Pfv/+nfQT2ggpO2dMF8cghuoM7Ygj5iWCqRlGFml0QC/ftGmTmzt3rmsaKDsgBSPh0/8yPeLLBihLkOKJc0jp8H8vUzcxIA1k6QJ/c78tWEyj5P3o4u9+jywNPdJi5rAH9x0KHcl4Hg570eQp3+vHXGyrmEeigzQsQsjavXt38ujRo44LQuDDhw+TW7duRS1HGgMxhNXHgflaNTOsHyKvHK5Ijo2jbFjJBQK9YwFd6RVMzfgRBmEfP37suBBm/p49e1qjEP2mwTViNRo0VJWH1deMXcNK08uUjVUu7s/zRaL+oLNxz1bpANco4npUgX4G2eFbpDFyQoQxojBCpEGSytmOH8qrH5Q9vuzD6ofQylkCUmh8DBAr+q8JCyVNtWQIidKQE9wNtLSQnS4jDS
sxNHogzFuQBw4cyM61UKVsjfr3ooBkPSqqQHesUPWVtzi9/vQi1T+rJj7WiTz4Pt/l3LxUkr5P2VYZaZ4URpsE+st/dujQoaBBYokbrz/8TJNQYLSonrPS9kUaSkPeZyj1AWSj+d+VBoy1pIWVNed8P0Ll/ee5HdGRhrHhR5GGN0r4LGZBaj8oFDJitBTJzIZgFcmU0Y8ytWMZMzJOaXUSrUs5RxKnrxmbb5YXO9VGUhtpXldhEUogFr3IzIsvlpmdosVcGVGXFWp2oU9kLFL3dEkSz6NHEY1sjSRdIuDFWEhd8KxFqsRi1uM/nz9/zpxnwlESONdg6dKlbsaMGS4EHFHtjFIDHwKOo46l4TxSuxgDzi+rE2jg+BaFruOX4HXa0Nnf1lwAPufZeF8/r6zD97WK2qFnGjBxTw5qNGPxT+5T/r7/7RawFC3j4vTp09koCxkeHjqbHJqArmH5UrFKKksnxrK7FuRIs8STfBZv+luugXZ2pR/pP9Ois4z+TiMzUUkUjD0iEi1fzX8GmXyuxUBRcaUfykV0YZnlJGKQpOiGB76x5GeWkWWJc3mOrK6S7xdND+W5N6XyaRgtWJFe13GkaZnKOsYqGdOVVVbGupsyA/l7emTLHi7vwTdirNEt0qxnzAvBFcnQF16xh/TMpUuXHDowhlA9vQVraQhkudRdzOnK+04ZSP3DUhVSP61YsaLtd/ks7ZgtPcXqPqEafHkdqa84X6aCeL7YWlv6edGFHb+ZFICPlljHhg0bKuk0CSvVznWsotRu433alNdFrqG45ejoaPCaUkWERpLXjzFL2Rpllp7PJU2a/v7Ab8N05/9t27Z16KUqoFGsxnI9EosS2niSYg9SpU6B4JgTrvVW1flt1sT+0ADIJU2maXzcUTraGCRaL1Wp9rUMk16PMom8QhruxzvZIegJjFU7LLCePfS8uaQdPny4jTTL0dbee5mYokQsXTIWNY46kuMbnt8Kmec+LGWtOVIl9cT1rCB0V8WqkjAsRwta93TbwNYoGKsUSChN44lgBNCoHLHzquYKrU6qZ8lolCIN0Rh6cP0Q3U6I6IXILYOQI513hJaSKAorFpuHXJNfVlpRtmYBk1Su1obZr5dnKAO+L10Hrj3WZW+E3qh6IszE37F6EB+68mGpvKm4eb9bFrlzrok7fvr0Kfv727dvWRmdVTJHw0qiiCUSZ6wCK+7XL/AcsgNyL74DQQ730sv78Su7+t/A36MdY0sW5o40ahslXr58aZ5HtZB8GH64m9EmMZ7FpYw4T6QnrZfgenrhFxaSiSGXtPnz57e9TkNZLvTjeqhr734CNtrK41L40sUQckmj1lGKQ0rC37x544r8eNXRpnVE3ZZY7zXo8NomiO0ZUCj2uHz58rbXoZ6gc0uA+F6ZeKS/jhRDUq8MKrTho9fEkihMmhxtBI1DxKFY9XLpVcSkfoi8JGnToZO5sU5aiDQIW716ddt7ZLYtMQlhECdBGXZZMWldY5BHm5xgAroWj4C0hbYkSc/jBmggIrXJWlZM6pSETsEPGqZOndr2uuuR5rF169a2HoHPdurUKZM4CO1WTPqaDaAd+GFGKdIQkxAn9RuEWcTRyN2KSUgiSgF5aWzPTeA/lN5rZubMmR2bE4SIC4nJoltgAV/dVefZm72AtctUCJU2CMJ327hxY9t7EHbkyJFseq+EJSY16RPo3Dkq1kkr7+q0bNmyDuLQcZBEPYmHVdOBiJyIlrRDq41YPWfXOxUysi5fvtyaj+2BpcnsUV/oSoEMOk2CQGlr4ckhBwaetBhjCwH0ZHtJROPJkyc7UjcYLDjmrH7ADTEBXFfOYmB0k9oYBOjJ8b4aOYSe7QkKcYhFlq3QYLQhSidNmtS2RATwy8YOM3EQJsUjKiaWZ+vZToUQgzhkHXudb/PW5YMHD9yZM2faPsMwoc7RciYJXbGuBqJ1UIGKKLv915jsvgtJxCZDubdXr165mzdvtr1Hz5LONA8jrUwKPq
smVesKa49S3Q4WxmRPUEYdTjgiUcfUwLx589ySJUva3oMkP6IYddq6HMS4o55xBJBUeRjzfa4Zdeg56QZ43LhxoyPo7Lf1kNt7oO8wWAbNwaYjIv5lhyS7kRf96dvm5Jah8vfvX3flyhX35cuX6HfzFHOToS1H4BenCaHvO8pr8iDuwoUL7tevX+b5ZdbBair0xkFIlFDlW4ZknEClsp/TzXyAKVOmmHWFVSbDNw1l1+4f90U6IY/q4V27dpnE9bJ+v87QEydjqx/UamVVPRG+mwkNTYN+9tjkwzEx+atCm/X9WvWtDtAb68Wy9LXa1UmvCDDIpPkyOQ5ZwSzJ4jMrvFcr0rSjOUh+GcT4LSg5ugkW1Io0/SCDQBojh0hPlaJdah+tkVYrnTZowP8iq1F1TgMBBauufyB33x1v+NWFYmT5KmppgHC+NkAgbmRkpD3yn9QIseXymoTQFGQmIOKTxiZIWpvAatenVqRVXf2nTrAWMsPnKrMZHz6bJq5jvce6QK8J1cQNgKxlJapMPdZSR64/UivS9NztpkVEdKcrs5alhhWP9NeqlfWopzhZScI6QxseegZRGeg5a8C3Re1Mfl1ScP36ddcUaMuv24iOJtz7sbUjTS4qBvKmstYJoUauiuD3k5qhyr7QdUHMeCgLa1Ear9NquemdXgmum4fvJ6w1lqsuDhNrg1qSpleJK7K3TF0Q2jSd94uSZ60kK1e3qyVpQK6PVWXp2/FC3mp6jBhKKOiY2h3gtUV64TWM6wDETRPLDfSakXmH3w8g9Jlug8ZtTt4kVF0kLUYYmCCtD/DrQ5YhMGbA9L3ucdjh0y8kOHW5gU/VEEmJTcL4Pz/f7mgoAbYkAAAAAElFTkSuQmCC"]
}'
{"model":"llava","created_at":"2024-03-27T22:43:39.689889Z","response":" The image shows an animated character that appears to be a stylized, cartoon-like creature. It's drawn in black and white with simple lines, giving it a cute and whimsical appearance. This character is often associated with a certain type of internet meme known as \"doge memes.\" ","done":true,"context":[733,16289,28793,1824,349,297,456,5754,28804,733,28748,16289,28793,415,3469,4370,396,25693,3233,369,8045,298,347,264,341,2951,1332,28725,7548,4973,28733,4091,15287,28723,661,28742,28713,10421,297,2687,304,3075,395,3588,4715,28725,5239,378,264,17949,304,388,7805,745,9293,28723,851,3233,349,2608,5363,395,264,2552,1212,302,7865,1626,28706,2651,390,345,17693,28706,1626,274,611,28705],"total_duration":2748008333,"load_duration":646098000,"prompt_eval_count":1,"prompt_eval_duration":1060742000,"eval_count":66,"eval_duration":1040527000}

Here is the reference image, decoded from the base64 input:

It isn't perfect, but it is consistent with the LLaVA demo, which uses a larger model (34b vs. mistral 7b):
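
Before blaming the model, it can help to confirm that the `images` string really decodes to the intended picture. The sketch below is an illustrative check, not part of ollama: it decodes only the start of the base64 string and reads the width and height from the PNG header. The function name `png_dimensions` is hypothetical, and the check only covers PNG input.

```python
import base64
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_dimensions(b64_image: str) -> tuple[int, int]:
    """Return (width, height) of a base64-encoded PNG, or raise ValueError."""
    # 32 base64 characters decode to 24 bytes: the 8-byte signature,
    # a 4-byte chunk length, the 4-byte chunk type "IHDR",
    # then width and height as big-endian 32-bit integers.
    head = base64.b64decode(b64_image[:32])
    if len(head) < 24 or head[:8] != PNG_SIGNATURE or head[12:16] != b"IHDR":
        raise ValueError("payload does not start with a PNG header")
    width, height = struct.unpack(">II", head[16:24])
    return width, height
```

Applied to the `images` string from the curl request above, the header reads 109 x 102 pixels. A payload that was truncated at the start, wrapped across lines, or prefixed with extra text would raise `ValueError` instead.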

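r6hnlfcb #4

@mxyng, for me the easiest way to reproduce this is to give LLaVA a large document containing a lot of text. The demo deployed on the web is clearly able to read and interpret the text in the image. The Ollama version of LLaVA, by contrast, either lies about the document's content or reads only the largest heading on the page and then claims it cannot read the finer details because the text is "blurry". This suggests it cannot interpret the text on the page.

Example image: this LLaVA 1.6 summary

Example result from https://llava.hliu.cc/

Example result from Ollama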

igsr9ssn #5

I can reproduce this here:

The source is:

tmb3ates #6

Has anyone managed to solve this? Ollama 0.1.34 supposedly fixed it, but not for me.

kuhbmx9i #7

I set this up with ollama 0.1.38 and llava-llama-3-8b-v1.1. I followed the instructions on the model card and used the int4 model.

https://huggingface.co/xtuner/llava-llama-3-8b-v1_1-gguf

$ ollama --version
ollama version is 0.1.38

I sent the llama image request above with curl, and this is the response:

In the image, there's a playful scene featuring a cartoon cat. The cat, drawn in black and white, is sitting on its hind legs with its front paws raised as if it's about to perform or execute something. Its eyes are closed, suggesting a sense of concentration or anticipation.\n\nThe cat's head is slightly tilted to the left, adding to its expressive posture. It has a small nose and ears, typical characteristics of many cartoon cats...

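I was also getting generic, random replies for every image (usually describing coffee and cafés), but my problem was that I had forgotten to include the mmproj file in my Modelfile.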

FROM ./models/llava-llama-3-8b-v1_1-int4.gguf
FROM ./models/llava-llama-3-8b-v1_1-mmproj-f16.gguf

With both files in place, llava now works for me in ollama.
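
The missing-projector mistake described above is easy to check for before creating the model. The following is a small sketch under stated assumptions, not an ollama feature: it lists the `FROM` lines of a Modelfile and flags the case where none of them points at a file with `mmproj` in its name. The function names `from_lines` and `has_projector` are hypothetical, and matching on the file name is only a heuristic based on the naming used in this thread.

```python
def from_lines(modelfile_text: str) -> list[str]:
    """Return the arguments of all FROM instructions in a Modelfile."""
    paths = []
    for line in modelfile_text.splitlines():
        parts = line.strip().split(None, 1)
        if len(parts) == 2 and parts[0].upper() == "FROM":
            paths.append(parts[1].strip())
    return paths


def has_projector(modelfile_text: str) -> bool:
    """Heuristic: True if any FROM line names a file containing 'mmproj'."""
    return any("mmproj" in path.lower() for path in from_lines(modelfile_text))


modelfile = """\
FROM ./models/llava-llama-3-8b-v1_1-int4.gguf
FROM ./models/llava-llama-3-8b-v1_1-mmproj-f16.gguf
"""
if not has_projector(modelfile):
    print("warning: no mmproj file referenced; image input may be ignored")
```

With the two-line Modelfile from this post the check passes silently; with only the first `FROM` line it prints the warning.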
