Why might MTP be net negative for tool-heavy agentic flows?

Posted by Substantial_Step_351@reddit | LocalLLaMA

The Qwen3.6-27B MTP benchmarks that have been circulating put factual tasks at 62-70% acceptance vs code at 79-89%. Tool calls probably sit in that factual range or lower: structured output in a constrained format, but less predictable than pure code generation. For agents doing dense tool-calling sequences, the prompt processing (PP) overhead on every prefill pass might consistently eat the token generation (TG) benefit. Not obvious MTP is net positive there tbh.
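To make the tradeoff concrete, here is a rough back-of-envelope sketch in Python. It is a toy model, not a measurement: it assumes each drafted token is accepted independently at the quoted acceptance rate, that drafting stops at the first rejection, and that MTP costs a fixed multiplicative slowdown on prefill and on each decode step. The function names, the slowdown factors, and all throughput numbers are made up for illustration.

```python
def expected_tokens_per_step(accept_rate: float, draft_len: int) -> float:
    """Expected tokens emitted per verification step.

    Assumes independent per-token acceptance and that drafting stops at
    the first rejected token, so the expectation is 1 + a + a^2 + ... + a^k.
    With draft_len == 0 this is 1.0 (plain autoregressive decoding).
    """
    return sum(accept_rate ** i for i in range(draft_len + 1))


def turn_time(prompt_tokens: float, output_tokens: float,
              pp_tok_per_s: float, tg_steps_per_s: float,
              accept_rate: float = 0.0, draft_len: int = 0,
              pp_slowdown: float = 1.0, step_slowdown: float = 1.0) -> float:
    """Seconds for one agent turn: prefill time plus decode time.

    prompt_tokens should be the tokens that actually get prefilled this
    turn (i.e. not served from a prefix cache). pp_slowdown and
    step_slowdown are hypothetical MTP overhead factors (1.0 = none).
    """
    prefill = prompt_tokens / pp_tok_per_s * pp_slowdown
    steps = output_tokens / expected_tokens_per_step(accept_rate, draft_len)
    decode = steps / tg_steps_per_s * step_slowdown
    return prefill + decode


def mtp_speedup(prompt_tokens: float, output_tokens: float,
                pp_tok_per_s: float, tg_steps_per_s: float,
                accept_rate: float, draft_len: int,
                pp_slowdown: float, step_slowdown: float) -> float:
    """Baseline turn time divided by MTP turn time (>1 means MTP wins)."""
    base = turn_time(prompt_tokens, output_tokens, pp_tok_per_s, tg_steps_per_s)
    mtp = turn_time(prompt_tokens, output_tokens, pp_tok_per_s, tg_steps_per_s,
                    accept_rate, draft_len, pp_slowdown, step_slowdown)
    return base / mtp


if __name__ == "__main__":
    # Illustrative numbers only: 1000 tok/s prefill, 30 steps/s decode,
    # one draft token, 15% prefill overhead, 10% per-step overhead.
    common = dict(pp_tok_per_s=1000.0, tg_steps_per_s=30.0, draft_len=1,
                  pp_slowdown=1.15, step_slowdown=1.10)
    # Tool-heavy turn: large tool result to prefill, short tool call out.
    print("agentic:", mtp_speedup(8000, 60, accept_rate=0.65, **common))
    # Chat/code turn: short prompt, long generation.
    print("chat:   ", mtp_speedup(500, 800, accept_rate=0.85, **common))
```

Under these made-up numbers the tool-heavy turn comes out slightly below 1x while the long-generation turn comes out well above it, which is the shape of the concern. Whether real pipelines land there depends on how much prefill overhead MTP actually adds on your backend and how much of each turn's prompt is already cached.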

Anyone actually running it on agentic pipelines seeing a different result?