StepFun 3.5 MTP by pwilkin · Pull Request #23274 · ggml-org/llama.cpp
Posted by jacek2023@reddit | LocalLLaMA | 11 comments
so we have StepFun MTP before Gemma MTP :)
XccesSv2@reddit
I love the llama.cpp devs. Does this also work for 3.7?
ilintar@reddit
Should, I'll be testing it.
XccesSv2@reddit
Yep, I downloaded the Q8 from here: https://huggingface.co/notSnix/Step-3.7-Flash-Q4_K_M-MTP-GGUF and used it with the unsloth quants via --model-draft Step-3.7-Flash-MTP-Q8_0.gguf --spec-type draft-mtp --spec-draft-n-max 2 --spec-draft-p-min 0.60
So yes, it works, and it feels very nice.
ilintar@reddit
Sweet!
jacek2023@reddit (OP)
I hope it will work with both 3.5 and 3.7, because I preferred 3.5 in my local tests.
dampflokfreund@reddit
Has Gemma 4 support been added?
jacek2023@reddit (OP)
link is in the post
mr_zerolith@reddit
Nice, any idea what the additional RAM requirement is?
pmttyji@reddit
Nice.
Dumb question: does this require a new GGUF, or is the existing one fine to play with? I ask because the Qwen3.6 models required new GGUFs.
Then-Topic8766@reddit
It seems so. I just built it and tried it with a StepFun GGUF. It throws an error: 'llama_init_from_model: context type MTP requested but model doesn't contain MTP layers'
Due_Net_3342@reddit
From what I gather from the discussion, this is more like a light version of the MTP implementation, with a proper one coming from StepFun?