Llama-3.2-1B-Instruct-q4f16_1-MLC vs qwen3.5:0.8b suggestions.

Posted by zenith-czr@reddit | LocalLLaMA | 3 comments

I'm currently using Llama for a local meal planner and nutritionist. It plans meals toward a diet goal chosen from a list of 14 diet protocols, backed by a DB of 400 deeply researched groceries and processed foods. It's meant to run on my phone on the go, with or without an internet connection. It works great for 7-8 questions, but then it gets very jittery and the entire phone lags (and gets warm). Then I clear the conversation and refresh to start again. It very rarely falls into an iteration loop, because I have given it proper context in the code. I just wanted to understand whether the newer Qwen would be better at this and more efficient, if that's the word to use.
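One thing that might be behind the slowdown after 7-8 questions is the conversation history growing with every turn. That assumes the app resends the whole history to the model each time, which the post does not confirm. Below is a minimal sketch of capping the history to the last few exchanges instead of clearing everything. `ChatMessage` and `trimHistory` are hypothetical names for illustration, not part of any library, and the role/content message shape is an assumption about how the app stores its chat.

```typescript
// Hypothetical message shape (an assumption, not taken from the original app).
type Role = "system" | "user" | "assistant";

interface ChatMessage {
  role: Role;
  content: string;
}

// Keep every system message (diet protocol, grocery context) and only the
// most recent `maxTurns` user/assistant exchanges.
function trimHistory(messages: ChatMessage[], maxTurns: number): ChatMessage[] {
  const system = messages.filter((m) => m.role === "system");
  const dialogue = messages.filter((m) => m.role !== "system");

  // One turn = one user message + one assistant reply.
  const keep = Math.max(0, Math.floor(maxTurns)) * 2;

  // slice(-0) would return the whole array, so handle zero explicitly.
  let recent = keep === 0 ? [] : dialogue.slice(-keep);

  // If the window starts on an assistant reply whose question was cut off,
  // drop it so the kept dialogue begins with a user message.
  while (recent[0]?.role === "assistant") {
    recent = recent.slice(1);
  }

  return [...system, ...recent];
}
```

Calling something like `trimHistory(history, 3)` before each request would keep the prompt roughly constant in size, at the cost of the model forgetting older questions. Whether that actually fixes the lag and heat on a given phone would need to be measured.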