TheaterFire

server, webui: support continue generation on reasoning models by ServeurpersoCom · Pull Request #22727 · ggml-org/llama.cpp

Posted by jacek2023@reddit | LocalLLaMA | View on Reddit | 3 comments

now you can CONTINUE

3 Comments

Chromix_@reddit

Finally, efficient parallel bulk generation with large input data (especially when paired with -kvu). If the context limit is hit, just store the temporary result and retry later when more context is free, instead of throwing it all away.
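
A minimal sketch of that store-and-retry pattern, for illustration only. The `generate` callable and its `(prompt, text_so_far) -> (new_text, finished)` signature are assumptions made up for this example, not a llama.cpp API; in practice it would wrap whatever request continues a partial response.

```python
from typing import Callable, List, Tuple

# Hypothetical backend: receives the prompt plus the text generated so far and
# returns (new_text, finished). finished=False models a request that stopped
# early, for example because the shared context ran out.
GenerateFn = Callable[[str, str], Tuple[str, bool]]


def run_bulk(prompts: List[str], generate: GenerateFn, max_rounds: int = 10) -> List[str]:
    """Generate for every prompt, keeping partial output between retries."""
    partial = ["" for _ in prompts]       # stored temporary results
    pending = list(range(len(prompts)))   # indices that still need work

    for _ in range(max_rounds):
        if not pending:
            break
        retry_later = []
        for i in pending:
            new_text, finished = generate(prompts[i], partial[i])
            partial[i] += new_text        # keep what was produced so far
            if not finished:
                retry_later.append(i)     # continue from the partial next round
        pending = retry_later

    # Anything still unfinished after max_rounds is returned as a partial result.
    return partial
```

The point of the design is that an interrupted request only costs the unfinished remainder: each retry passes the stored text back in rather than starting over.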

rerri@reddit

Can you also edit text within the thinking block? At some point this was not possible for some reason.

LegacyRemaster@reddit

very good news!