Gemma 3 4B is very good at tool calling.
You might need to edit the system prompt to get it to give you output in the correct format.
I also suggest doing tool calling in Python code mode. The output is concise and very easy to parse, so you can limit output tokens to around 2000 and still work wonders with Gemma 3 4B.
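For anyone wondering what "Python code mode" parsing could look like, here is a minimal sketch. It assumes the system prompt tells the model to reply with a single call such as `get_weather(city="Paris")`. The tool names, the `TOOLS` registry, and `run_tool_call` are all made up for illustration and are not part of any Gemma tooling.

```python
import ast

# Hypothetical tools; names and signatures are illustrative only.
def get_weather(city: str, unit: str = "celsius") -> str:
    return f"Weather for {city} in {unit}"

def add(a: float, b: float) -> float:
    return a + b

# Allowlist of callables the model is permitted to invoke.
TOOLS = {"get_weather": get_weather, "add": add}

def run_tool_call(model_output: str):
    """Parse one Python-style call emitted by the model and dispatch it."""
    tree = ast.parse(model_output.strip(), mode="eval")
    call = tree.body
    if not isinstance(call, ast.Call) or not isinstance(call.func, ast.Name):
        raise ValueError("expected a single call like tool_name(arg=value)")
    name = call.func.id
    if name not in TOOLS:
        raise ValueError(f"unknown tool: {name}")
    # literal_eval only accepts literals, so nested calls or other
    # expressions in the arguments are rejected with ValueError.
    args = [ast.literal_eval(a) for a in call.args]
    kwargs = {}
    for kw in call.keywords:
        if kw.arg is None:  # reject **kwargs unpacking
            raise ValueError("keyword unpacking is not allowed")
        kwargs[kw.arg] = ast.literal_eval(kw.value)
    return TOOLS[name](*args, **kwargs)

print(run_tool_call('get_weather(city="Paris")'))
print(run_tool_call("add(2, 3)"))
```

The point of going through `ast` instead of `eval` is that the model's text is never executed directly: only allowlisted function names with literal arguments get through.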
You do realize that Model Context Protocol (MCP) servers exist? Claude 3.7 is a thinking model, and as far as I can tell it can interact with user-specific tools that way.
I saw someone use it for reverse engineering with an IDA Pro + MCP server combo.
Actually, from the OpenAI o3 publications, it seems the larger model has the function/tool capabilities. They're likely scared of it being used maliciously, though, hence only the mini for now.
It likely costs money, and thinking models are a new, harder way of doing things. If anything, it's probably more cost-effective to have a non-thinking model handle tool functions and give it a tool to query a thinking model. It's possible that adding too much tool calling to a thinking model degrades its capacity for thinking. It's a mixed bag of unknowns.
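A rough sketch of that split, in case it helps picture it. Both "models" here are stub functions, and `fast_model`, `thinking_model`, `read_file`, and `handle` are hypothetical names. A real setup would wrap actual API clients and let the non-thinking model pick the tool itself.

```python
from typing import Callable, Dict, Tuple

# Stand-in for an expensive thinking model (assumption: a real one sits behind an API).
def thinking_model(question: str) -> str:
    return f"[careful answer to: {question}]"

# Stand-in for an ordinary tool.
def read_file(path: str) -> str:
    return f"[contents of {path}]"

# The thinking model is exposed as just another tool.
TOOLS: Dict[str, Callable[[str], str]] = {
    "read_file": read_file,
    "ask_thinking_model": thinking_model,
}

def fast_model(user_request: str) -> Tuple[str, str]:
    """Stub for the cheap non-thinking model: returns (tool_name, argument)."""
    if user_request.startswith("open "):
        return "read_file", user_request[len("open "):]
    # Anything it cannot handle with a plain tool gets escalated.
    return "ask_thinking_model", user_request

def handle(user_request: str) -> str:
    tool_name, argument = fast_model(user_request)
    return TOOLS[tool_name](argument)

print(handle("open notes.txt"))
print(handle("why is the sky blue?"))
```

The cheap model does all the routing, so the thinking model only gets billed when a request actually needs it.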
Surprisingly, a lot of these LLM companies are running on bare-minimum margins. GPT Pro's $200/month doesn't even make money for OpenAI most of the time.
Claude 3.7 does work, so that's something you can use if you want to. It's probably one of the better LLMs out there for tool calling, coding, or other non-creative stuff.
Don't know what else to say. It's a big enough mystery to warrant curiosity, but not enough to hate on thinking models.
15 Comments
Radiant_Dog1937@reddit
Osama_Saba@reddit (OP)
0xCODEBABE@reddit
Osama_Saba@reddit (OP)
Foreign-Beginning-49@reddit
0xCODEBABE@reddit
Foreign-Beginning-49@reddit
johncarpen1@reddit
x0wl@reddit
ezjakes@reddit
Osama_Saba@reddit (OP)
Clear-Ad-9312@reddit
Osama_Saba@reddit (OP)
Clear-Ad-9312@reddit
Clear-Ad-9312@reddit