We need some polls on many topics - 2026

Posted by pmttyji@reddit | LocalLLaMA | 8 comments

I haven't seen many polls so far this year. Sharing a few topics here (with some options; please feel free to add or remove options for the polls).

1. Coding Assistants:

Roo-code went away recently. What are you using?

Cline
Open Code
Pi
Kilo Code
Aider
Continue
Tabby
OpenHands
Zed
goose

2. Agents:

I remember that Openclaw made headlines for some time, and now its popularity has declined. What are you using?

Hermes Agent
ChatDev
Camel
CrewAI
Dify
AutoGPT
OpenHands
Browser Use
Openclaw

3. Inference engines:

llama.cpp (and its wrappers) / ik_llama.cpp
vLLM
MLX
SGLang
TensorRT
Transformers
ExLlama(V2/V3)

4. Inference:

CPU-Only (No VRAM)
CPU-Only (Only for small models)
Hybrid
GPU-Only (if model fits VRAM)
GPU-Only always

Could somebody please post polls on the topics above? I'm not an expert on these topics, so please post polls with strong contenders. Use this thread to decide on the strong contenders from the given options (and add any other options). Also post polls on any other topics.

Jipok_@reddit

1/2) I manually copy pieces of code from/to the chat.
3) llama.cpp and openrouter
4) gpu-only, speed is all u need

pmttyji@reddit (OP)

1/2) I manually copy pieces of code from/to the chat.

That's fine for tiny or small projects. But for big projects, we need to go with agentic coding, dude. I'm waiting for my new rig to try agentic coding.

Jipok_@reddit

It really depends on what you define as "big."

If "big" just means navigating massive piles of boilerplate or spaghetti code, then sure, agents might help. But for complex, logically dense algorithms, the rule of thumb is: the smaller the context, the higher the quality of the output.

The biggest issue with almost every agent nowadays is severe context bloat. They aggressively stuff the window with terminal outputs, irrelevant file trees, and system prompts, which completely tanks the model's reasoning capabilities.

I can curate the necessary context manually far better than any agent's retrieval heuristic. I’ve tried Cline and Phind, but ended up going back to the old-school manual copy-paste method. It keeps the model focused, maintains high precision, and honestly saves me time in the long run.
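
As a rough illustration of the manual curation described above, here is a minimal sketch, not a real tool. It assumes a hypothetical `Snippet` type and `build_prompt` helper, and a crude estimate of about 4 characters per token that does not match any particular tokenizer. The idea is simply to paste only hand-picked pieces and drop anything that would blow the budget.

```python
from dataclasses import dataclass

# Assumption: ~4 characters per token is only a rough heuristic,
# not the behavior of any particular tokenizer.
CHARS_PER_TOKEN = 4


@dataclass
class Snippet:
    label: str  # e.g. a file path or function name (hypothetical)
    text: str   # the hand-picked code pasted by the user


def estimate_tokens(text: str) -> int:
    """Very rough token estimate based on character count (rounded up)."""
    return (len(text) + CHARS_PER_TOKEN - 1) // CHARS_PER_TOKEN


def build_prompt(task: str, snippets: list[Snippet], budget_tokens: int) -> str:
    """Assemble a prompt from hand-picked snippets, in the given order,
    skipping any snippet that would push the estimate over the budget."""
    parts = [f"Task: {task}"]
    used = estimate_tokens(parts[0])
    for snip in snippets:
        block = f"--- {snip.label} ---\n{snip.text}"
        cost = estimate_tokens(block)
        if used + cost > budget_tokens:
            continue  # leave it out rather than bloat the context
        parts.append(block)
        used += cost
    return "\n\n".join(parts)
```

The order of the snippets is whatever the person pasting them chooses, so the most relevant code can go first and is the least likely to be dropped.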

UntimelyAlchemist@reddit

1) Open Code, but I'm also experimenting with Pi inside VSCode dev containers.
2) Hermes, because it was easier to set up than OpenClaw. I haven't even heard of the other ones you listed.
3) llama.cpp.
4) GPU-only.

jacek2023@reddit

I think you should spend more time using local LLMs than posting.

pmttyji@reddit (OP)

Waiting for the new rig. It got delayed again. I'll get it in 2 weeks probably.

jacek2023@reddit

You can run some models on a potato.

pmttyji@reddit (OP)

Unfortunately, my current laptop has display issues, which is why I couldn't even try MTP on recent Qwen models.

So I'm posting here from my old laptop (DDR3, 16GB RAM, no GPU) for now 😄