Ollama, Why No Reka Flash, SmolLM3, GLM-4?
Posted by chibop1@reddit | LocalLLaMA | 20 comments
I don't expect Ollama to have every finetuned model in their main library, and I understand that you can import GGUF models from Hugging Face.
Still, it seems pretty odd that they're missing Reka Flash-3.2, SmolLM3, and GLM-4. I believe other platforms like LM Studio, MLX, Unsloth, etc. have them.
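For anyone who hasn't done the import: Ollama can pull GGUFs straight off the Hub, so the workaround is a one-liner (the repo and quant tag here are just illustrative):

    ollama run hf.co/unsloth/GLM-4-9B-0414-GGUF:Q4_K_M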
AppearanceHeavy6724@reddit
I still can't see why anyone would still use Ollama when you can run llama.cpp directly, shrug.
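To be fair, "directly" is also just one command, and llama-server gives you an OpenAI-compatible endpoint (model path and flags illustrative):

    llama-server -m ./models/SmolLM3-3B-Q4_K_M.gguf -c 8192 --port 8080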
klop2031@reddit
Tbh I use Ollama because it's easy to use. Llama.cpp isn't that much harder, but alas, Ollama is just easier to deal with.
chibop1@reddit (OP)
If llama.cpp works for you, keep using it, shrug.
colin_colout@reddit
Have you tried other llama.cpp wrappers / forks? I haven't used them much myself, but I hear good things about llama-swap and KoboldAI.
My experience with Ollama lines up with what you're saying, and it's why I moved to llama.cpp directly, but I hear there are other replacement options (maybe others can chime in with what they use).
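For the curious: llama-swap sits in front of llama-server and swaps models per request. A minimal config sketch, with the paths, model name, and flag spellings from memory rather than the docs, so double-check against the project README:

    # config.yaml
    models:
      "glm-4":
        cmd: llama-server -m /models/GLM-4-9B-Q4_K_M.gguf --port ${PORT}

Then launch it with llama-swap --config config.yaml and point your client at it.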
AppearanceHeavy6724@reddit
Then what's the point of your diatribe about Ollama not having the latest models? It is what it is, and they don't owe you the latest models in their repos. You chose an inferior product; deal with it, or ask their support forums why they are the way they are.
__Maximum__@reddit
ollama run model_name has much less friction if I want to try something out than downloading a model, running a command line, and making sure all the params are correct, plus switching models back and forth.
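For example, trying something is literally one line (model tag illustrative):

    ollama run qwen3

and the download, quant choice, and default params are handled for you.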
hayTGotMhYXkm95q5HW9@reddit
Ollama "just works" especially when using open web ui.
FORLLM@reddit
I first installed it for bolt.diy; after that I found it worked nicely with other programs and started building my own frontend around it. It works well and plays well with others. I'm sure it's not the only backend that does, but I've had zero problems with it. Not sure why I'd keep looking for something else until it stops meeting my needs.
AppearanceHeavy6724@reddit
With Ollama you're missing features that come with llama.cpp, and you're also at the mercy of the Ollama devs as to which models you can use.
FORLLM@reddit
So far it has the features I need, and I can install any model from Hugging Face.
I'm not opposed to other software, but I've seen a weird backlash against Ollama where some people seem upset that others are using it. I'm not trying to convince anyone else to use it; I'm just not sure why people seem to want me not to.
FullstackSensei@reddit
Because Ollama is the other software. They wrap llama.cpp while making a lot of its flexibility and power obscure or hard to use. They also do some shady things, like giving false names to models.
AppearanceHeavy6724@reddit
Well, in this case continue using it, but OP clearly has problems with Ollama. The backlash against Ollama comes from its perceived lack of value over the foundation it's built on.
throwawayacc201711@reddit
I don't mind running CLI commands and configuring things, but my main entry point for working with my models is Open WebUI. The nice thing I found with Ollama is that it will dynamically use my VRAM and system RAM if I select a model larger than my GPU can hold, and it offers some decent optimizations. Is there a no-touch solution that llama.cpp offers for that?
Example: go to Open WebUI, select a model, and llama.cpp dynamically figures out how to offload layers?
This has been the main reason I haven't made the switch; I've also been preoccupied with other r/selfhosted projects.
AppearanceHeavy6724@reddit
Of course. Basic functionality.
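To be concrete: with llama-server the split is controlled by -ngl, the number of layers offloaded to the GPU; layers beyond that count stay in system RAM. A sketch with illustrative values:

    llama-server -m ./GLM-4-9B-Q4_K_M.gguf -ngl 30 -c 8192

The difference from Ollama is that you generally pick -ngl yourself to match your VRAM, rather than having it estimated for you.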
jacek2023@reddit
What's so awesome about Ollama?
chibop1@reddit (OP)
CONVENIENCE!!! Nothing more.
Marksta@reddit
Convenience looks like bad defaults, confusingly renamed DeepSeek distill models, silent quantization, and random models not being available, it seems 🤔
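At least the silent quantization is easy to check after the fact; ollama show prints what you actually pulled (model name illustrative):

    ollama show llama3.2

It lists the architecture, parameter count, context length, and quantization level.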
Federal_Order4324@reddit
I still don't get why they misname models. It's kind of idiotic imo. It's genuinely just bad for the program as a whole, no?
-Ellary-@reddit
Kobold.cpp is our best friend.
Goldandsilverape99@reddit
You can create a Modelfile yourself. Ask Grok how to do it... =)
Here is a short version.
Save a Modelfile:

    # Point FROM at the GGUF you downloaded (e.g. from Hugging Face)
    FROM /path/to/your/GLM-4-Q4_K_M.gguf
    # Sampling temperature and context window size
    PARAMETER temperature 0.8
    PARAMETER num_ctx 4096
    SYSTEM You are a helpful assistant with expertise in AI and programming.

Add it with:

    ollama create my-custom-model -f Modelfile

Run it:

    ollama run my-custom-model
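One caveat worth adding: if replies come out with broken formatting, the Modelfile probably also needs a TEMPLATE directive matching the model's chat format. This is only a generic sketch, not GLM-4's actual template, so check the model card:

    TEMPLATE """{{ if .System }}<|system|>
    {{ .System }}{{ end }}<|user|>
    {{ .Prompt }}<|assistant|>
    """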