Lemonade OmniRouter: unifying the best local AI engines for omni-modality

Posted by jfowers_amd@reddit | LocalLLaMA | View on Reddit | 28 comments

I’ve always liked how if I ask ChatGPT to make or edit an image, it just does it. Local AI should be this convenient! One install, one endpoint. Ask for an image of a cat and it appears. Ask for a hat on the cat, with a narrated story. Now we can easily build immersive experiences.

Lemonade's OmniRouter brings that same pattern to local through built-in tools:

Your workflow talks to Lemonade running on your own NPU/GPU through OpenAI-compatible tool calling.

How it works:

  1. Lemonade sets up all these local AI engines for your system.
  2. Add Lemonade’s tool definitions to your workflows.
  3. When your LLM triggers a tool call it gets routed to the corresponding engine (sd.cpp, whisper.cpp, kokoros).
  4. Feed the result back into your loop.

That’s it. No custom orchestration layer, no new abstractions to learn. Check it out in this 181-line e2e Python example.

We’ve added support for OmniRouter in our reference web ui (also available as a Tauri app), which is what you’re seeing in the video. But I’m much more excited to see what people build on top.

I know my next project is going to be some kind of TTRPG-style adventure game. It’s already surprisingly fun to ask OmniRouter to be a dungeon master who illustrates and narrates the story, and I think it can be enhanced quite a bit if I build an app/harness around it.

If you find this interesting, please drop us a star and say hi! * GitHub: https://github.com/lemonade-sdk/lemonade * Discord: https://discord.gg/5xXzkMu8Zk