Llama-3.2-1B-Instruct-q4f16_1-MLC vs qwen3.5:0.8b suggestions.

Posted by zenith-czr@reddit | LocalLLaMA | 3 comments

I am currently using Llama for a local meal planner and nutritionist that works toward a diet goal chosen from a list of 14 diet protocols, backed by a DB of 400 deeply researched groceries and processed foods. It's meant to be used on the go on my phone, with or without an internet connection. It works great for 7-8 questions, but then it gets very jittery and the entire phone lags (and also becomes warm). Then I clear the conversation and refresh to start again. It very rarely falls into an iteration loop because I have given it proper context in the code. I just wanted to understand whether the newer Qwen would be better at this and more efficient, if that's the word to use.
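To make the "clear conversation and refresh" step concrete: a slowdown after 7-8 questions may come from the chat history growing with every turn, though that is only a guess from the symptoms. A minimal sketch of capping the history instead of wiping it, assuming the app keeps messages in the common chat-completions shape. `ChatMessage`, `trimHistory` and `maxTurns` are hypothetical names for illustration, not part of any library.

```typescript
// Hypothetical message shape, modeled on the common chat-completions format.
type Role = "system" | "user" | "assistant";

interface ChatMessage {
  role: Role;
  content: string;
}

// Keep every system message (e.g. the diet protocol and grocery context)
// plus only the most recent `maxTurns` user/assistant messages.
function trimHistory(messages: ChatMessage[], maxTurns: number): ChatMessage[] {
  const system = messages.filter((m) => m.role === "system");
  const dialogue = messages.filter((m) => m.role !== "system");
  // slice(-0) would return the whole array, so handle maxTurns <= 0 explicitly.
  const recent = maxTurns > 0 ? dialogue.slice(-maxTurns) : [];
  return [...system, ...recent];
}
```

Calling `trimHistory(history, 6)` before each request would keep the system context and the last three question/answer pairs. Whether that actually removes the lag on a given phone is untested here.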

metroshake@reddit

Try Gemma e2b or qwen 3.5, much faster and way better

zenith-czr@reddit (OP)

Thanks for the suggestion!

kompania@reddit

Bang, I've hit another cockroach bot with my slipper. You Chinese have a looooong way to learn how to promote your LLM models. Currently, your bots are instantly detectable, like this one.