How do I parse tool calls in llama.cpp?

Posted by sZebby@reddit | LocalLLaMA | View on Reddit | 10 comments

Most of my code is similar to agent-cpp from Mozilla. I build a common_chat_templates_inputs struct from the message history, then apply the chat templates:

auto params = common_chat_templates_apply(templs_, inputs);

Tokenization and generation work fine, but when I try to parse the tool calls with:

common_chat_parser_params p_params = common_chat_parser_params(params);

common_chat_msg msg = common_chat_parse(response, false, p_params);

there are no tool_calls in msg, and the assistant generation prompt ends up prepended to the content.

So msg.content looks like this:

<|im_start...........

I expected that tool calls would be populated.
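
For completeness, the surrounding code is roughly this (simplified, error handling omitted; field names are from what I understand of common/chat.h and may not match exactly):

    // Build the template inputs from the accumulated history.
    common_chat_templates_inputs inputs;
    inputs.messages              = history;   // accumulated common_chat_msg list
    inputs.tools                 = tools;     // tool definitions passed on every turn
    inputs.add_generation_prompt = true;

    auto params = common_chat_templates_apply(templs_, inputs);
    // ...tokenize params.prompt, run generation into `response`...

    // Then parse the completed generation:
    common_chat_parser_params p_params = common_chat_parser_params(params);
    common_chat_msg msg = common_chat_parse(response, /* is_partial = */ false, p_params);
    // msg.tool_calls is always empty, and msg.content starts with the template markers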

I'm currently using granite-4.0-h-micro-Q4_K_S with the latest llama.cpp.

Is my way of generating wrong? Any suggestions would be highly appreciated. Thanks :)