AdditionalWeb107

I built a Claude Code Router TUI

Posted by AdditionalWeb107@reddit | LocalLLaMA | View on Reddit | 2 comments
Signals – finding the most informative agent traces without LLM judges (arxiv.org)

Posted by AdditionalWeb107@reddit | LocalLLaMA | View on Reddit | 0 comments
Plano reaches 5K GH stars as I continue to help devs build agents locally

Posted by AdditionalWeb107@reddit | LocalLLaMA | View on Reddit | 2 comments
Plano 0.4.3 ⭐️ Filter Chains via MCP and OpenRouter Integration

Posted by AdditionalWeb107@reddit | LocalLLaMA | View on Reddit | 0 comments
Just launched Plano v0.4 - a unified data plane supporting polyglot AI development

Posted by AdditionalWeb107@reddit | Python | View on Reddit | 1 comment
Is it one big agent, or sub-agents?

Posted by AdditionalWeb107@reddit | LocalLLaMA | View on Reddit | 8 comments
I built Plano(A3B): most efficient LLMs for agent orchestration that exceed frontier model perf

Posted by AdditionalWeb107@reddit | LocalLLaMA | View on Reddit | 35 comments
archgw 0.3.20 - gutted 500 MB worth of Python dependencies from the request path.

Posted by AdditionalWeb107@reddit | LocalLLaMA | View on Reddit | 0 comments
archgw 0.3.20 - 500 MB of Python dependencies gutted - faster, leaner proxy server for agents.

Posted by AdditionalWeb107@reddit | Python | View on Reddit | 0 comments
From OSS to $1M in contract value from one customer. How? Forward deployed engineers.

Posted by AdditionalWeb107@reddit | ExperiencedDevs | View on Reddit | 7 comments
Arch-Router: The first (and fastest) LLM router that can align to your usage preferences.

Posted by AdditionalWeb107@reddit | LocalLLaMA | View on Reddit | 25 comments
🚀 HuggingFaceChat Omni: Dynamic policy-based routing to 115+ LLMs

Posted by AdditionalWeb107@reddit | LocalLLaMA | View on Reddit | 5 comments
Preference-aware routing to local LLMs for Claude Code 2.0

Posted by AdditionalWeb107@reddit | LocalLLaMA | View on Reddit | 0 comments
Claude Code 2.0 Router - Access Ollama-based LLMs and align automatic routing to preferences, not benchmarks.

Posted by AdditionalWeb107@reddit | LocalLLaMA | View on Reddit | 10 comments
ArchGW 🚀 - Use Ollama-based LLMs with Anthropic client (release 0.3.13)

Posted by AdditionalWeb107@reddit | LocalLLaMA | View on Reddit | 4 comments
Examining the 72,988-character-long Claude Code Prompt

Posted by AdditionalWeb107@reddit | LocalLLaMA | View on Reddit | 1 comment
ArchGW 0.3.12 🚀 Model aliases: allow clients to use friendly, semantic names and swap out underlying models without changing application code.

Posted by AdditionalWeb107@reddit | LocalLLaMA | View on Reddit | 0 comments
Real engineering work in AI gets paid. We went from a demo to $500k in total contract value building "networking" for agents.

Posted by AdditionalWeb107@reddit | ExperiencedDevs | View on Reddit | 19 comments
The outer loop vs. the inner loop of agents. A simple mental model to evolve the stack quickly

Posted by AdditionalWeb107@reddit | Python | View on Reddit | 7 comments
The outer loop vs. the inner loop of agents. A mental model to evolve the agent stack quickly and push to production faster.

Posted by AdditionalWeb107@reddit | LocalLLaMA | View on Reddit | 6 comments
Detecting Hallucinations in LLM Function Calling with Entropy (Part 2)

Posted by AdditionalWeb107@reddit | LocalLLaMA | View on Reddit | 3 comments
GPT-5 Style Router, but for any LLM including local.

Posted by AdditionalWeb107@reddit | LocalLLaMA | View on Reddit | 64 comments
My team has to stop this "let me grab this AI framework" mentality and think about overall system design.

Posted by AdditionalWeb107@reddit | ExperiencedDevs | View on Reddit | 17 comments
Connect 3rd party SaaS tools to your agentic apps - ArchGW 0.2.1 🚀 adds support for bearer authorization for upstream APIs for function calling scenarios.

Posted by AdditionalWeb107@reddit | LocalLLaMA | View on Reddit | 4 comments
[Research] Align LLM routing to task preferences, not benchmarks - with a fast 1.5B model

Posted by AdditionalWeb107@reddit | ExperiencedDevs | View on Reddit | 2 comments
Handle follow-up or clarifying questions in RAG scenarios (with ease)

Posted by AdditionalWeb107@reddit | LocalLLaMA | View on Reddit | 3 comments
Strategies for handling transient Server-Sent Events (SSE) from LLM responses

Posted by AdditionalWeb107@reddit | ExperiencedDevs | View on Reddit | 12 comments
Strategies for handling transient Server-Sent Events (SSE) from LLM responses

Posted by AdditionalWeb107@reddit | LocalLLaMA | View on Reddit | 3 comments
Finally an LLM router that thinks like an engineer

Posted by AdditionalWeb107@reddit | programming | View on Reddit | 9 comments
Vibe coding RouteGPT - a Chrome extension that aligns model routing to my preferences, powered by a small but powerful LLM.

Posted by AdditionalWeb107@reddit | LocalLLaMA | View on Reddit | 0 comments
Finally, an LLM Router That Thinks Like an Engineer - And It's Local

Posted by AdditionalWeb107@reddit | LocalLLaMA | View on Reddit | 3 comments
An alternative to semantic or benchmark-based routing: A preference-aligned router model

Posted by AdditionalWeb107@reddit | LocalLLaMA | View on Reddit | 6 comments
Arch-Agent Family of LLMs - Designed for fast, multi-step agent orchestration.

Posted by AdditionalWeb107@reddit | LocalLLaMA | View on Reddit | 7 comments
From Arch-Function to Arch-Agent. Designed for fast multi-step, multi-turn workflow orchestration in agents.

Posted by AdditionalWeb107@reddit | LocalLLaMA | View on Reddit | 16 comments
Semantic routing and caching don't work - task-specific LLMs (TLMs) ftw!

Posted by AdditionalWeb107@reddit | LocalLLaMA | View on Reddit | 9 comments
Core infrastructure patterns implemented in AI coding frameworks - will come home to roost

Posted by AdditionalWeb107@reddit | ExperiencedDevs | View on Reddit | 28 comments
ArchGW 0.2.8 is out 🚀 - unifying repeated "low-level" functionality in building LLM apps via a local proxy.

Posted by AdditionalWeb107@reddit | LocalLLaMA | View on Reddit | 16 comments
How to load a 4-bit quantized 1.5B parameter LLM in the browser?

Posted by AdditionalWeb107@reddit | LocalLLaMA | View on Reddit | 8 comments
Using a local runtime to run models for an open source project vs. HF transformers library

Posted by AdditionalWeb107@reddit | LocalLLaMA | View on Reddit | 2 comments
I think triage agents should run "out-of-process". Here's why.

Posted by AdditionalWeb107@reddit | LocalLLaMA | View on Reddit | 16 comments
Why are people rushing to programming frameworks for agents?

Posted by AdditionalWeb107@reddit | LocalLLaMA | View on Reddit | 17 comments
I built an open-source AI-native proxy for LLM agents

Posted by AdditionalWeb107@reddit | programming | View on Reddit | 4 comments
Arch. The AI-native proxy server that handles the low-level application logic for agents

Posted by AdditionalWeb107@reddit | Python | View on Reddit | 0 comments
Arch-Function-Chat Trending #1 on HuggingFace!

Posted by AdditionalWeb107@reddit | LocalLLaMA | View on Reddit | 10 comments
When vibe coding no longer vibes back

Posted by AdditionalWeb107@reddit | LocalLLaMA | View on Reddit | 75 comments
Not GPT-4, but a 3B Function Calling LLM that can chat to clarify tool calls

Posted by AdditionalWeb107@reddit | LocalLLaMA | View on Reddit | 13 comments
Arch-Function-Chat (1B/3B/7B) - Device-friendly family of fast LLMs for function calling scenarios, now trained to chat.

Posted by AdditionalWeb107@reddit | LocalLLaMA | View on Reddit | 7 comments
Who is building MCP servers - and how are you thinking about exposure risks?

Posted by AdditionalWeb107@reddit | LocalLLaMA | View on Reddit | 13 comments
How I adapted a 1B function calling LLM for fast routing and agent hand-off scenarios in a framework-agnostic way.

Posted by AdditionalWeb107@reddit | LocalLLaMA | View on Reddit | 11 comments
I built a small (function calling) LLM that packs a big punch; integrated in an open source gateway for agentic apps

Posted by AdditionalWeb107@reddit | LocalLLaMA | View on Reddit | 75 comments