I’ve been building a local autonomous coding agent that can plan, execute, validate, and fix its own code — entirely offline.
Posted by Keyboard_Lord@reddit | LocalLLaMA | 5 comments
I’ve been working on a terminal-native coding agent called Rasputin.
This started as an attempt to build a “Codex at home” system, but with stronger guarantees around determinism, auditability, and recovery — closer to what I think these systems should look like in practice.
It runs fully locally (Ollama + qwen2.5-coder) and can:
- plan multi-step coding tasks
- execute changes
- run validation (build/tests)
- fix its own errors (bounded self-healing loop)
- track everything with an audit log + replay
It’s not just a chat wrapper — it runs a constrained execution loop with:
- deterministic state (ExecutionState / ExecutionOutcome)
- validation-gated commits (fail-closed)
- checkpoint + resume
- bounded retries + recovery
- completion confidence (so it doesn’t declare success too early)
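To make "validation-gated commits (fail-closed)" and "bounded retries" concrete, here is a minimal sketch of that kind of loop. It is not taken from the Rasputin repo: `run_step`, `ValidationResult`, and the four callbacks are hypothetical names, and the "validation" is a toy string check standing in for a real build or test run.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ValidationResult:
    ok: bool
    log: str = ""


def run_step(
    apply_change: Callable[[], None],
    validate: Callable[[], ValidationResult],
    fix: Callable[[str], None],
    commit: Callable[[], None],
    max_retries: int = 3,
) -> bool:
    """Apply a change, then commit only if validation passes (fail-closed)."""
    apply_change()
    for attempt in range(max_retries + 1):
        result = validate()
        if result.ok:
            commit()  # the only code path that ever reaches commit
            return True
        if attempt < max_retries:
            fix(result.log)  # hand the failure log back for a repair attempt
    return False  # retries exhausted: nothing is committed


# Toy workspace (assumption): one "file" with a typo that validation rejects.
state = {"code": "retrun 1", "committed": None}


def validate() -> ValidationResult:
    return ValidationResult(ok="return" in state["code"], log="SyntaxError: retrun")


def fix(log: str) -> None:
    state["code"] = state["code"].replace("retrun", "return")


def commit() -> None:
    state["committed"] = state["code"]


ok = run_step(lambda: None, validate, fix, commit, max_retries=2)
```

The point of the shape is that failure is the default: if every retry fails, the function returns `False` and `commit` is never called, so a broken change cannot land by accident.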
I also built a benchmark harness to test it on real coding tasks.
Latest result (qwen2.5-coder:14b):
8/8 PASS, 0 partial, 0 fail
Everything runs locally — no API, no rate limits.
Repo:
https://github.com/Keyboard-Lord/Rasputin-Coder
Would love feedback — especially where you think this approach breaks or doesn’t scale.
LocalLLaMA-ModTeam@reddit
Rule 3 - Minimal value post. slop
Better-Monk8121@reddit
ai slop
Keyboard_Lord@reddit (OP)
It’s a bounded execution loop, not a chat agent.
Model → structured plan → tool execution (read/write/patch/run) → validation (build/tests) → recovery on failure → completion gate.
Key constraints:
- deterministic state (ExecutionState / ExecutionOutcome)
- validation-gated commits (fail-closed)
- bounded retries + recovery
- checkpoint + resume
- full audit log with replay
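For anyone wondering what checkpoint + resume could look like for the tool-execution stage, here is a rough sketch under my own assumptions. The checkpoint format (a JSON file with a `completed` counter), `run_plan`, and the handler table are hypothetical and only illustrate the idea; the tool names read/write/patch/run are the ones listed above.

```python
import json
import os
import tempfile


def run_plan(steps, handlers, checkpoint_path):
    """Run steps in order, saving progress after each so a restart resumes."""
    done = 0
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            done = json.load(f)["completed"]
    for index in range(done, len(steps)):
        step = steps[index]
        handlers[step["tool"]](step)
        # Write to a temp file, then rename, so a crash cannot leave
        # a half-written checkpoint behind.
        tmp = checkpoint_path + ".tmp"
        with open(tmp, "w") as f:
            json.dump({"completed": index + 1}, f)
        os.replace(tmp, checkpoint_path)
    return len(steps) - done  # number of steps executed in this call


executed = []
crash_once = {"armed": True}


def record(step):
    # Simulate a crash the first time the "patch" step runs.
    if step["tool"] == "patch" and crash_once["armed"]:
        crash_once["armed"] = False
        raise RuntimeError("simulated crash")
    executed.append(step["tool"])


plan = [{"tool": "read"}, {"tool": "write"}, {"tool": "patch"}, {"tool": "run"}]
handlers = {name: record for name in ("read", "write", "patch", "run")}
path = os.path.join(tempfile.mkdtemp(), "checkpoint.json")

try:
    run_plan(plan, handlers, path)
except RuntimeError:
    pass  # the first run dies at "patch"; the checkpoint says 2 steps are done

resumed = run_plan(plan, handlers, path)  # picks up at "patch"
```

In this sketch the second call skips `read` and `write` and runs only the remaining two steps. A real agent would also need each step to be safe to re-run, since a crash can happen after the tool call but before the checkpoint is written.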
Keyboard_Lord@reddit (OP)
Structured plan → execute → validate → recover → complete.
Deterministic state, validation-gated commits, bounded retries, checkpoint/resume, full audit + replay.
Keyboard_Lord@reddit (OP)
L1: ExecutionState → live progress
L2: ExecutionOutcome → final truth
L3: AuditLog → full history
L4: Replay → reconstruction + validation
L5: Checkpoint → safe continuation
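As an illustration of the AuditLog and Replay layers, here is a small sketch of replay as "reconstruction + validation". The event schema (`op`, `path`, `content`), the SHA-256 hash chain, and the names `AuditLog` and `replay` as written here are my assumptions for the example, not the actual Rasputin data model.

```python
import hashlib
import json
from dataclasses import dataclass, field


def _digest(prev_hash: str, event: dict) -> str:
    payload = json.dumps(event, sort_keys=True)
    return hashlib.sha256((prev_hash + payload).encode()).hexdigest()


@dataclass
class AuditLog:
    entries: list = field(default_factory=list)

    def append(self, event: dict) -> None:
        # Each entry's hash covers the previous hash, forming a chain.
        prev = self.entries[-1]["hash"] if self.entries else ""
        self.entries.append({"event": event, "hash": _digest(prev, event)})


def replay(log: AuditLog) -> dict:
    """Rebuild file state from the log, verifying the hash chain on the way."""
    files = {}
    prev = ""
    for entry in log.entries:
        event = entry["event"]
        if _digest(prev, event) != entry["hash"]:
            raise ValueError("audit log entry does not match its hash")
        prev = entry["hash"]
        if event["op"] == "write":
            files[event["path"]] = event["content"]
        elif event["op"] == "delete":
            files.pop(event["path"], None)
    return files


log = AuditLog()
log.append({"op": "write", "path": "main.py", "content": "print('v1')"})
log.append({"op": "write", "path": "tmp.py", "content": "scratch"})
log.append({"op": "write", "path": "main.py", "content": "print('v2')"})
log.append({"op": "delete", "path": "tmp.py"})

files = replay(log)
```

Here replay does two jobs at once: it reconstructs the final workspace from history alone, and it refuses to do so if any entry was altered after it was recorded. The reconstructed state can then be compared against whatever the final outcome record claims.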