Looking for software that processes images in realtime (or periodically).
Posted by My_Unbiased_Opinion@reddit | LocalLLaMA | 5 comments
Are there any projects out there that let a multimodal LLM process a window in realtime? Basically, I'm trying to have a GUI watch a window, take a screenshot periodically, send it to Ollama, have it processed with a system prompt, and spit out an output, all hands-free.
I've been looking at some OSS projects but haven't seen anything (or else I'm not looking in the right places).
Thanks for y'all's help.
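For anyone landing here with the same question, the loop described above (screenshot, send to Ollama, print the result, repeat) can be sketched in a few lines of Python. This is only a rough sketch under some assumptions: Ollama is running locally on its default port, a multimodal model is already pulled (`llava` here is just a placeholder), and Pillow is installed for the capture. The function names (`build_payload`, `grab_region`, `ask_ollama`, `watch`) are made up for illustration. It also grabs a fixed screen region rather than tracking a specific window, since window lookup is platform-specific.

```python
import base64
import io
import json
import time
import urllib.request

# Ollama's default local endpoint for single-shot generation.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(png_bytes, prompt, system, model="llava"):
    """Build the JSON body for /api/generate with one base64-encoded image."""
    return {
        "model": model,  # assumption: any multimodal model you have pulled
        "system": system,  # the system prompt applied to every screenshot
        "prompt": prompt,
        "images": [base64.b64encode(png_bytes).decode("ascii")],
        "stream": False,  # ask for one complete JSON reply instead of chunks
    }


def grab_region(bbox=None):
    """Capture the screen, or bbox=(left, top, right, bottom), as PNG bytes."""
    # Imported lazily so the rest of the module works without Pillow or a display.
    from PIL import ImageGrab

    image = ImageGrab.grab(bbox=bbox)
    buffer = io.BytesIO()
    image.save(buffer, format="PNG")
    return buffer.getvalue()


def ask_ollama(payload, url=OLLAMA_URL, timeout=120):
    """POST the payload to Ollama and return the generated text."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())["response"]


def watch(bbox, prompt, system, interval_s=10.0, model="llava"):
    """Capture, query, print, sleep; runs until interrupted (Ctrl+C)."""
    while True:
        payload = build_payload(grab_region(bbox), prompt, system, model)
        print(ask_ollama(payload))
        time.sleep(interval_s)
```

Calling something like `watch((0, 0, 1280, 720), "Describe what is on screen.", "You are a concise screen monitor.")` would start the loop. The interval is worth tuning to how long the model takes per image, otherwise requests just queue up behind each other.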
5 Comments
vasileer@reddit
My_Unbiased_Opinion@reddit (OP)
throwawayacc201711@reddit
SM8085@reddit
Calcidiol@reddit