Ollama API image payload format for Python

Posted by Ok-Internal9317@reddit | LocalLLaMA

Hi guys, is this the correct Python payload format for Ollama? { "role": "user", "content": "what is in this image?", "images": ["iVBORw0KQuS..."] #base64 } I'm asking because I ran the same gemma12b model on both OpenRouter and Ollama with the same prompt and the same image encoding: OpenRouter returned a sensible description, while Ollama seemed to have no clue about the image it was describing. The Ollama documentation says this format is right, but I have tested it for a while and could not get matching results from OpenRouter and Ollama. My goal is to build a Python parser that sends an image to an LLM and gets text back. Thanks for helping!
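
For reference, here is a minimal sketch of how a message like the one above might be sent to a local Ollama server. It assumes the default endpoint `http://localhost:11434/api/chat`, a non-streaming request, and a model tag of `gemma3:12b`. The post only says "gemma12b", so that tag is a guess and should be swapped for whatever `ollama list` shows. The helper names (`build_payload`, `strip_data_url_prefix`, `describe_image`) are made up for illustration. One thing worth checking, as far as I understand the two APIs: Ollama's `images` field expects the bare base64 string, while OpenAI-style endpoints such as OpenRouter usually take a `data:image/...;base64,` URL, so reusing the same string for both could explain the mismatch.

```python
import base64
import json
import urllib.request

# Assumed default address of a local Ollama server; adjust if yours differs.
OLLAMA_URL = "http://localhost:11434/api/chat"


def strip_data_url_prefix(b64: str) -> str:
    """Drop a 'data:image/...;base64,' prefix if one is present."""
    if b64.startswith("data:") and "," in b64:
        return b64.split(",", 1)[1]
    return b64


def build_payload(image_bytes: bytes, prompt: str, model: str) -> dict:
    """Build a chat request body with the image as bare base64 text."""
    image_b64 = base64.b64encode(image_bytes).decode("ascii")
    return {
        "model": model,
        "stream": False,  # ask for one JSON object instead of a stream
        "messages": [
            {"role": "user", "content": prompt, "images": [image_b64]}
        ],
    }


def describe_image(path: str, prompt: str, model: str = "gemma3:12b") -> str:
    """Send the image at `path` to Ollama and return the reply text."""
    with open(path, "rb") as f:
        payload = build_payload(f.read(), prompt, model)
    request = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["message"]["content"]


# Example call (needs a running Ollama server and a real image file):
# print(describe_image("photo.png", "what is in this image?"))
```

If the base64 string already exists elsewhere in the code, passing it through `strip_data_url_prefix` before putting it in `images` is a cheap way to rule out the prefix issue. It is also worth confirming that the local model tag is a vision-capable build, since a text-only variant would ignore the image entirely.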