TheaterFire

I don't want thinking models

Posted by Osama_Saba@reddit | LocalLLaMA | View on Reddit | 15 comments

They don't use tools



Radiant_Dog1937@reddit

Use a tool calling model to relay the message to the thinking model if it's needed.

Osama_Saba@reddit (OP)

I don't need the thinking. Just make something like Gemma 3b, but good with tool calling please. Thanks

0xCODEBABE@reddit

i don't like screwdrivers. they can't hammer nails

Osama_Saba@reddit (OP)

That's a valid point, it's your choice to build your house out of nails, but I prefer screws....... Safer and more better

Foreign-Beginning-49@reddit

It's strength we are after, though. Screws have no shear force resistance.

0xCODEBABE@reddit

yes they do?

Foreign-Beginning-49@reddit

Nails generally have better shear force resistance than screws. I should have kept my text shut. Cheers.

johncarpen1@reddit

Gemma 3 4B is very good with tool calling. You might need to edit the system prompt so it gives you output in the correct format. I also suggest doing tool calling in Python code mode: the output is concise and very easy to parse, so you can limit output tokens to around 2000 and it will still work wonders with Gemma 3 4B.
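
For anyone wondering what "Python code mode" could look like in practice, here is a minimal sketch. It assumes the system prompt asks the model to answer with exactly one Python call expression, such as `get_weather(city="Paris")`. The `get_weather` tool, the `TOOLS` registry, and `run_tool_call` are hypothetical names made up for illustration, not part of Gemma or any library. The call is parsed with the standard `ast` module rather than `eval`, so only allow-listed functions with literal arguments get executed.

```python
import ast


def get_weather(city: str, unit: str = "C") -> str:
    # Hypothetical tool; a real one would call a weather API.
    return f"Weather in {city}: 20 {unit}"


# Allow-list of functions the model is permitted to call (assumed registry).
TOOLS = {"get_weather": get_weather}


def run_tool_call(model_output: str):
    """Parse one call expression emitted by the model and dispatch it."""
    # mode="eval" accepts a single expression only, so statements such as
    # `import os` are rejected with a SyntaxError.
    tree = ast.parse(model_output.strip(), mode="eval")
    call = tree.body
    if not isinstance(call, ast.Call) or not isinstance(call.func, ast.Name):
        raise ValueError("expected a single call like tool_name(arg=value)")
    name = call.func.id
    if name not in TOOLS:
        raise ValueError(f"unknown tool: {name}")
    # literal_eval only accepts plain literals (strings, numbers, lists, ...),
    # so nested calls or attribute access inside the arguments are refused.
    args = [ast.literal_eval(arg) for arg in call.args]
    kwargs = {}
    for kw in call.keywords:
        if kw.arg is None:
            raise ValueError("**kwargs unpacking is not supported")
        kwargs[kw.arg] = ast.literal_eval(kw.value)
    return TOOLS[name](*args, **kwargs)


if __name__ == "__main__":
    print(run_tool_call('get_weather(city="Paris")'))
```

Because the model only has to emit one short line, a tight output-token limit is usually enough, and a malformed reply fails loudly at parse time instead of being executed.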

x0wl@reddit

Phi4-mini-instruct? Also Qwen3 soon and if OpenAI releases a non-thinking small model it will be like this too.

ezjakes@reddit

Why can't thinking models use tools?

Osama_Saba@reddit (OP)

Idk

Clear-Ad-9312@reddit

You do realize that Model Context Protocol (MCP) servers exist? Claude 3.7 is a thinking model, and as far as I can tell it is able to interact with user-specific tools that way. I saw someone use it for reverse engineering with an IDA Pro + MCP server combo.

Osama_Saba@reddit (OP)

Interesting.. then why don't they have function calling in the API of the thinking models in places?

Clear-Ad-9312@reddit

Actually, from the OpenAI o3 publications, it seems the larger model does have function/tool capabilities. They are likely scared of it being used maliciously, though, hence only the mini for now.

Clear-Ad-9312@reddit

Likely it costs money, and thinking models are a new, harder way of doing things. If anything, it is probably more cost-effective to have a non-thinking model handle tool functions and give it a tool for querying a thinking model. It is possible that adding too much tool calling to the thinking model can degrade its capabilities for thinking. It's a mixed bag of unknowns. Surprisingly, a lot of these LLM companies are running on bare-minimum margins. ChatGPT Pro at $200/month doesn't even make money for OpenAI most of the time. Claude 3.7 does work, so that is something you can use if you want to. It's probably one of the better LLMs out there for tool calling, coding, or other non-creative stuff. Don't know what else to say; it's a big enough mystery to warrant curiosity, but not enough to hate on thinking models.
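
A rough sketch of that split, in case it helps: the non-thinking model emits tool calls, and one of the tools simply forwards a question to a thinking model. Everything here is an assumption for illustration. `make_tools`, `handle_tool_call`, `ask_thinking_model`, and the `{"name": ..., "arguments": ...}` call shape are hypothetical, and the thinking model is passed in as a plain callable so no real API is implied.

```python
from typing import Callable, Dict

# A "model" here is just any callable from prompt text to reply text.
# In a real setup this would wrap an API client (assumption).
Model = Callable[[str], str]


def make_tools(thinking_model: Model) -> Dict[str, Callable[..., str]]:
    """Build the tool registry offered to the non-thinking model."""

    def ask_thinking_model(question: str) -> str:
        # Escalate only the hard sub-question to the slower, costlier model.
        return thinking_model(question)

    def add(a: float, b: float) -> str:
        # Ordinary cheap tool that needs no reasoning at all.
        return str(a + b)

    return {"ask_thinking_model": ask_thinking_model, "add": add}


def handle_tool_call(tools: Dict[str, Callable[..., str]], call: dict) -> str:
    """Dispatch one tool call of the assumed form {"name": ..., "arguments": {...}}."""
    name = call["name"]
    if name not in tools:
        raise KeyError(f"unknown tool: {name}")
    return tools[name](**call["arguments"])


if __name__ == "__main__":
    # Stand-in for a real thinking model, used only for the demo.
    def fake_thinker(prompt: str) -> str:
        return f"[thought hard about: {prompt}]"

    tools = make_tools(fake_thinker)
    print(handle_tool_call(tools, {"name": "add", "arguments": {"a": 2, "b": 3}}))
    print(handle_tool_call(
        tools,
        {"name": "ask_thinking_model", "arguments": {"question": "Is 91 prime?"}},
    ))
```

The thinking model never sees a tool schema in this setup; it only answers plain-text questions, which sidesteps the question of whether its API supports function calling.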