Small open-source models can behave like real agents if the runtime owns the protocol

Posted by -eth3rnit3-@reddit | LocalLLaMA


I’ve been working on a Ruby project called Kernai.

It’s technically an agent runtime, but I’m not trying to make the 100th “agent harness”. The thing I wanted to explore was a bit different:

What happens if the runtime owns the execution protocol, instead of depending on provider-native tool calling, framework abstractions, or huge prompt-injected tool registries?

The core idea is very small:

What I find interesting is that this makes agent behavior much more portable across models.

Even models with no native tool calling can still work in this setup.
And in my tests, even small open-source models can handle surprisingly complex scenarios if the execution contract is clear enough. They usually take more steps and are less reliable than bigger models, but they still work as agents.
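To make the "runtime owns the protocol" idea concrete, here is a minimal sketch in Ruby. It is not Kernai's actual API or wire format: the `CALL`/`FINAL` line protocol, the `run_agent` and `execute` helpers, the `add` tool, and the `fake_model` lambda are all made up for illustration. The point is only that the model returns plain text and the runtime does the parsing, dispatching, and looping, so native tool calling is not required.

```ruby
require "json"

# Hypothetical text protocol (not Kernai's): each model reply is either
# `CALL <tool> <json-args>` or `FINAL <answer>`.
CALL_PATTERN  = /\ACALL (\w+) (\{.*\})\z/m
FINAL_PATTERN = /\AFINAL (.*)\z/m

# Tools are plain lambdas owned by the runtime; the model never sees their code.
TOOLS = {
  "add" => ->(args) { args.fetch("a") + args.fetch("b") }
}.freeze

# Execute one parsed call and return a string the model can read,
# including for unknown tools or malformed arguments.
def execute(tool_name, raw_args)
  tool = TOOLS[tool_name]
  return "ERROR unknown tool #{tool_name}" unless tool

  "RESULT #{tool.call(JSON.parse(raw_args))}"
rescue JSON::ParserError, KeyError => e
  "ERROR #{e.class}"
end

# The runtime drives the loop: ask the model, parse its text, act, repeat.
def run_agent(model, task, max_steps: 5)
  transcript = ["TASK #{task}"]
  max_steps.times do
    reply = model.call(transcript).strip
    transcript << reply
    if (final = FINAL_PATTERN.match(reply))
      return final[1]
    elsif (call = CALL_PATTERN.match(reply))
      transcript << execute(call[1], call[2])
    else
      # Off-protocol output is fed back as an error instead of crashing.
      transcript << "ERROR reply with CALL <tool> <json> or FINAL <text>"
    end
  end
  nil # step budget exhausted without a FINAL reply
end

# Stand-in for any text-only model: calls the tool once, then answers.
fake_model = lambda do |transcript|
  last = transcript.last
  if last.start_with?("RESULT ")
    "FINAL #{last.delete_prefix('RESULT ')}"
  else
    'CALL add {"a": 2, "b": 3}'
  end
end

answer = run_agent(fake_model, "add 2 and 3")
puts answer
```

Swapping `fake_model` for a call to a real local model would not change the loop, which is where the portability comes from. The error branch matters for smaller models: an off-protocol reply costs an extra step rather than ending the run.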

Another thing I think matters a lot: the agent context stays very light.

A lot of current agent systems inject huge tool definitions, MCP registries, schemas, etc. directly into the prompt. That works, but it also bloats context and mixes everything together from the start.

With this approach, the runtime stays much more exploratory:

So instead of dumping every skill and every MCP tool into context upfront, the agent explores capabilities progressively:

That keeps the prompt lighter, makes the execution model cleaner, and in practice seems to help even smaller models.
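Progressive discovery can be sketched roughly like this. Again, this is an illustration under assumptions, not Kernai's implementation: the `Catalog` class, the `Skill` struct, the `list`/`describe` split, and the two example skills are all hypothetical. The initial prompt only carries a short index, and the full usage of a skill is fetched when the agent asks for it.

```ruby
# Hypothetical capability catalog; the skill names, fields, and the
# list/describe split are illustrative assumptions, not Kernai's API.
Skill = Struct.new(:name, :summary, :usage, keyword_init: true)

class Catalog
  def initialize(skills)
    @skills = skills.to_h { |skill| [skill.name, skill] }
  end

  # Cheap index: one short line per skill, suitable for the initial prompt.
  def list
    @skills.values.map { |s| "#{s.name}: #{s.summary}" }.join("\n")
  end

  # Full detail for a single skill, fetched only when the agent asks for it.
  def describe(name)
    skill = @skills[name]
    skill ? "#{skill.name}\n#{skill.summary}\n#{skill.usage}" : "unknown skill #{name}"
  end

  # What eager injection would put in the prompt: every skill in full.
  def full_dump
    @skills.keys.map { |name| describe(name) }.join("\n\n")
  end
end

CATALOG = Catalog.new([
  Skill.new(name: "weather", summary: "Look up a forecast",
            usage: 'weather {"city": "<name>", "days": <1-7>}'),
  Skill.new(name: "notes", summary: "Search saved notes",
            usage: 'notes {"query": "<text>", "limit": <integer>}')
])

puts CATALOG.list
puts CATALOG.describe("weather")
puts "index: #{CATALOG.list.length} chars, full dump: #{CATALOG.full_dump.length} chars"
```

With two toy skills the difference is small, but the index grows by one line per skill while the full dump grows by the whole schema of each one, so the gap widens as more skills and MCP tools are registered.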

I also wanted to keep the whole thing very minimal:

There are a bunch of tested scenarios in the repo, including:

What made it feel real to me is that I’ve already built a personal shell on top of it, and I’m now integrating the same approach into an existing commercial product where agents interact with the app at different levels.

So this isn’t really me trying to launch a shiny new AI framework.

It’s more me sharing an approach that feels simpler, lighter, and more robust than most of what I’ve tried in this space.

Repo if anyone wants to take a look: https://github.com/Eth3rnit3/kernai

Curious what people think, especially if you also feel that a lot of current agent stacks are getting too heavy.