How many of you have seriously started using AI agents in your workplace or day to day life?
Posted by last_llm_standing@reddit | LocalLLaMA | View on Reddit | 186 comments
What agents do you use and how has it impacted your work?
Curious how people in different industries are adopting AI agents, and to what scale.
If you build your own agents from scratch, feel free to dorp your techstack or bare metal pipeline!
Free_Change5638@reddit
I build AI agent products full time — previously did vector search engines and RAG systems at a database company, now shipping my own stuff. Since OP asked for tech stacks, here's what I actually run day to day, no fluff.
My entire development workflow runs through Claude Code with Opus 4.6. Not Cursor, not Copilot. Terminal-native, agentic, 1M token context. I point it at a repo, describe what I want, and it plans, edits across 20+ files, runs tests, fixes failures, and opens a PR. I used to spend maybe 60% of my day writing code. Now I spend 60% of my day reviewing code that an agent wrote. The shift happened faster than I expected and it's not going back.
But here's where it gets interesting for this sub specifically — I don't trust cloud agents with everything. My daily workflow agent runs locally. The architecture is dead simple and I'll tell you exactly why: a while loop, an LLM call, a tool router, and a handful of native tools (bash, file I/O, browser, database). That's it. No LangChain, no CrewAI, no framework. I tried all of them and ripped every single one out within weeks. They add abstraction where you need transparency. When your agent breaks at step 14 of a 20-step task, you need to see exactly what happened — not dig through three layers of somebody else's middleware.
The thesis I've arrived at after building this for 2 years: an agent doesn't need much. Give it a shell, a database, and a browser — it can already do 90% of what these bloated tool ecosystems promise. MCP is useful as a standard protocol, sure, but conceptually the whole "plugin marketplace" thing is solving a problem Unix solved decades ago. The real AI-native OS is just Linux with an LLM making function calls. Once I internalized that, my architecture went from 15 dependencies to basically 3.
What actually changed my daily life: I have an agent that watches my GitHub notifications, Hacker News threads relevant to my space, and competitor repos — synthesizes everything into a morning briefing I read with coffee. Another one that takes my rough markdown notes after a user interview and restructures them into product specs with prioritized action items. A third that monitors my production logs, detects anomaly patterns, and drafts incident reports before I even know something's wrong. None of these required anything fancy to build. The hard part was never the agent framework — it was defining the task boundary tightly enough that the LLM doesn't hallucinate its way into uselessness.
The thing I'd push back on in this thread: people keep asking "what agents do you use" as if agents are products you install. The agents that actually stick in your workflow are the ones you build yourself, for your specific context, with your specific data. A generic "AI assistant" will always feel like a toy. An agent that knows your codebase, your docs, your naming conventions, your deployment pipeline — that's when it stops feeling like AI and starts feeling like a very fast junior employee who never sleeps.
Running locally matters more than people think, by the way. Not just for privacy — for latency. An agent that bounces through 3 API calls and takes 12 seconds per action is dead on arrival for any interactive workflow. Local inference on a decent GPU gives you sub-second tool calls. That's the difference between an agent you actually use and one you demo once then forget about.
Basting_Rootwalla@reddit
Heya. I was wondering if you could point me to and resources for going full local and open source, particularly with as minimal dependencies as possible (what you're describing.)
Any time I try to research, I'm inundated with basically the JavaScript framework equivalent for LLM stuff. 10000 different tools, frameworks, services, etc... where no one really gets what's going underneath it all.
Prime example would be how does one set up their own agent locally with just llama-cpp-turboquant and a quantized model? Let's just sat I'm reluctantly switching my study to LLMs because I need to start looking for work again in the high level business and web software world. (I've spent the past 6 months with embedded systems, EE, and low level systems but I gotta shelf that for now even though it's where I really want to push career trajectory.)
Maybe it'll all come to fruition and we'll finally make much, much smaller hyper focused LLMs or verticals that then open up the hardware world. Idk what use case there would be for say an LLM embedded into a microcontroller based application, but I'm sure we'll eventually get to more "AI" of some sort integration into specialized systems like tiny ML.
Opening-Contest-1500@reddit
Honestly, AI agents are becoming more useful than basic chatbots now. A lot of teams are using them for workflow automation, research, scheduling, customer support, and even internal operations.
One thing that stands out in 2026 is how AI agents are gradually becoming part of regular workflows instead of just experimental tools.
rentprompts@reddit
The important signal in How many of you have seriously started using AI agents in your workplace or day to day life? is the operating constraint, not the headline. Those numbers -- 3x -- matter because they change who can actually run it.
For creators or agencies, I would test it with one repeat task: same prompt, 10 runs, track output quality, failure rate, and cost per usable result. That is where hype becomes a buying decision.
Safe-Breakfast14@reddit
We’ve started using a couple internally (ticket triage + report drafting). Nothing crazy but they save a decent amount of time. The bigger challenge has actually been controlling what data they can access ended up looking into stuff like NeuralTrust for that layer.
AssignmentDull5197@reddit
We use lightweight agents for internal workflows: meeting notes -> action items, ticket routing, and db queries with a read-only tool. Biggest impact is time saved, not magic autonomy. If youre curious about patterns/stacks, https://medium.com/conversational-ai-weekly has practical writeups.
HopePupal@reddit
we have Cursor and Copilot at work but some of my coworkers are morons so i don't know if that's contributing to anything other than KPIs, really tedious reviews, and my manager's sense of thinking he can still code (he can't). i don't believe in 10× engineers but -3× engineers are real and AI makes them stupid faster. they can pull the code gacha handle all day and still not understand what they're doing. it's going to fuck us eventually when someone realizes the majority of our post-AI tests cover cases that don't occur outside tests. i thank god every day that i no longer work in safety-of-life-critical software.
a few of the more senior engineers are also all in on agents but they're not really shipping any faster as far as i can tell. they're not sending me total trash to review either so idk it's fine.
i use them for throwaway utilities, really easy fixes, medium-complex refactorings i can't do with IntelliJ's deterministic refactoring commands, and those rare bugs where i can write a straightforward set of regression tests. they'd probably be more useful if we had better UI automation tests. LLMs are also weirdly good for rubber ducking: if you can explain a plan to an agent in enough detail to give the thing a prayer of finishing, you can explain the plan to anyone.
at home i also use them for throwaways, easy fixes, medium refactorings, and rubber ducking. also webshit, but i already said "throwaways", so i repeat myself. except at home it's local models (Opencode in Alpine VMs, calling Minimax, experimenting with Qwen 3.5 27B) and a few bucks a month of Jetbrains Junie (which is currently slightly drunk Claude in a trenchcoat).
xienze@reddit
I’m worried about the day, I dunno, 20 years from now when an entire generation of developers only knows “ask the LLM to do it.” You’re already starting to see reports about Gen Z folks having boomer-tier understanding of computing because all they’ve ever known is tapping on iPads. It’s gonna be like that but worse.
hockey-throwawayy@reddit
In 20 years, finding someone who knows how to close a tag will be like finding a Cobol dev today!
creaturefeature16@reddit
Sweet. Cobol engineers get paid BANK
vineavip@reddit
thIs is so relatable, I took on too many projects at work and delegated python library development to a coworker who's a senior dev. As team's python "expert" I provided him with architectural direction, even wrote it down as copilot instructions and committed that to the repo.
What's happening now my suggested patterns are indeed implemented but in the most backward way. I try to be empathetic during code reviews but it's becoming a mess, context window will soon be too small for this style of coding. What could be 500 lines of tight code is 4000 now. I know I'll have to clean it up myself at some point as there are use cases on the roadmap that will be too clunky with current APIs.
The add insult to the injury, management requests random features and I set up a design precisely to account for that but he generates more than he understands to get tickets done as quickly as possible. He also doesn't push back on lack of time for proper testing because he can generate unit tests and the only quality gate is coverage.
EnergyNational@reddit
And you can garentee those unit tests are useless lol
Thunderstarer@reddit
This. I use LLMs in exactly the way you do and I have several -3x engineers on my team. I think it's probably a net-negative for us, but I do find them convenient on the occasion.
our_sole@reddit
Lol. After seeing so much "omg 10x!" , I laughed at -3x. When i was still coding, I think I was a 1.5x. At least i was positive..
_bones__@reddit
Really good comment.
It's certainly helped me in a few cases where we had obvious mistakes in code. It quickly pointed out that you shouldn't recreate an AsynClient for every request in python. I had it create a benchmark, and it was 50x faster to reuse the client. It was completely won't about why, though, and kept confidently going "you're right, it's not, great catch! It's actually because of "
I think it shines in doing proof of concepts. It will write those much more completely than I would, given time constraints, and if they're self contained they're easy to tweak. When adopting the PoC, you can copy/paste what you need and throw the thing away.
Parking-Ad3046@reddit
I run a mix of local and cloud agents. Local I use Ollama with some fine tuned models for text summarization and internal documentation. But for actual production work where I need reliability, I use Runable for visual content. It's not self hosted but honestly the output quality is better than anything I've been able to run locally. I feed it a long form post or a product update, and it autonomously breaks it down into carousel slides, picks layouts, generates images, and outputs everything sized for LinkedIn, Instagram, and Twitter. That's a multi step planning process. Definitely agent behavior. Local can't touch it yet for this specific use case.
Rachit_sri@reddit
Hi, I am treating LLMs as a intern humans who knows how to code at some level. when they enter a company each company creates processes, railguards and pipelines to minimise human errors as much as possible. this is what I am experimenting in the repo. any suggestions or help is welcome.
https://github.com/rachit1994/ai-agent-generator
shinji@reddit
my work is going hard on it. Lots of experimentation and in dev, people using claude code hooked up to AWS bedrock with beads for context. Others experimenting with claude teams. We also have a bunch of agent tasks in the CI pipeline that can be ran for stuff like merge request description and changelog, code-reviewer agent that comments on merge requests. Gitlab and Jira MCPs are in place now. We also have a Slackbot with the complete company docs and knowlegebase and code repo access.
We have a datadog dashboard that shows how much everyone's spend is on the claude bedrock stuff and it's huge. I see some devs using $100+ a day. It was a total of $4000+ in a week for everyone and quickly rising. Almost all code now is generated.
It's just a matter of weeks or months until they hook up that Jira MCP and Gitlab together and start letting agents pick up bugs with zero dev involvement.
The writing is on the wall.
EnergyNational@reddit
The irony is the more they use it the more it improves and only a matter of time before that one dev using $100 of credits in a day, prompting 'fix this', will be replaced with a AI that does exactly that.
jeremyckahn@reddit
I use Claude Code for basically everything at work. I'm a senior software engineer, but I don't write code anymore. I direct Claude to do it all. Typically I one-shot my way to 50% completion and then iterate and refine my way to 100% with followup prompts.
bigh-aus@reddit
The one thing that I will say is we all collectively need to STOP using the worst possible language for agentic coding. Building a CLI? don't build it in typescript, ruby, python - build it in go, rust or zig. Are these harder to get right? yes but they provide stronger guardrails against slop, and are compiled and efficient.
I'm Heavily using agents and looking more and more how I can remove myself from the loop. Starting to investigate fully autonomous agents that can code, review etc. The jump from agent to factory is a big one however. Getting things right too is tricky.
I'm also doing interpreted language takeouts of some oss tools - Currently it's me using codex / claude, but it's actually really easy to convert something - eg bitwarden cli from nodejs to rust. Next up will be to have a team of agents do it automatically.
LickMyTicker@reddit
The problem is that training sets actually steer some of this. LLMs are undoubtedly going to affect programming choices more than people going forward.
NefariousnessFar2266@reddit
that's only true if you have no idea what you're doing. if you're a developer proper, you're not just offloading the entire choice of your technical stack to the whims of the LLM (which to your point will almost always be TS or Python).
I'm mainly pointing out legitimate developers who may THINK they will get better results going with TS / Python because of the sheer popularity but in reality you'll have a better experience using a top 10 compiled lang; best of all worlds.
NefariousnessFar2266@reddit
LLM's are great at generating Go, no reason to not use it; it's so simple - IMO it is the best backend lang for LLM oriented devs because you get all its toolchain benefits (while escaping having to write its less than desirable syntax).
bigh-aus@reddit
All the sea lies I’m building are written in rust but it was the language that I used before that.. and I totally agree with you. I’m actually surprised how competent most owls are even riding rust applications.. as long as it is compiled and optimised. I think that’s the main point but it does drive me nuts when somebody build something cool and then there is no valid reason that it needs to be written in an interpreted language. The whole npm install fad makes me super frustrated.
xienze@reddit
When I see stuff like this I gotta ask, did you ever have a true passion for software development or is it just a thing you do to put food on the table? I’ve been working with computers my entire life and I get tremendous satisfaction being able to solve new and interesting problems in elegant ways. The money makes me feel lucky to have such a talent. Being a glorified manager directing “Claude” to shit out solutions just seems like such a soul sucking thing. Don’t get me wrong, I love playing with AI but I want to write code and just want AI to help me get unstuck. It’s wild to me to see so many developers so excited about doing what’s essentially boring project management shit.
NefariousnessFar2266@reddit
IMO you're just not very creative if you think this way. There's so much intellectualism to be achieved in SWE outside of manually writing code - if you just see it as "essentially boring project management shit..." I don't know what to tell you.
Though I suppose it's not fair to assume you're not creative, maybe you are and are just not willingly / have yet to try chasing satisfaction outside the bounds of syntax generation.
Eastern_Committee_38@reddit
I think that train has left the station. The future is present and vice versa. AI agents opens up novel ways of solving problems and lots of problems which were in the back burner. So far AI is living off on investors but we have no idea when AI companies will actually making profits
jeremyckahn@reddit
I love programming. It's my favorite hobby and have dedicated much of my life to being as good as I can at it. It's just my hobby now though, not my profession. I'm grateful for the many years of my career where coding was both my hobby and my profession, and I'm a little sad that those days are behind me. But that's life, things change. I'm choosing to evolve with the industry because I need to optimize my employability. It's the pragmatic choice, though maybe not the romantic one.
So it goes. 🤷
LickMyTicker@reddit
Yea. I don't know why people are letting AI kill their love for programming.
As someone who also likes to play chess, it's not like I'm just going to call it quits because a computer can do it better 100% of the time. There's realistically no reason for anybody to play chess because engines have surpassed us, but it doesn't stop anyone from playing.
Programming is fun. Even pointless hacking like injecting assembly codes into roms. I'm not going to not enjoy it just because it's dated. I also think programming alongside AI is fun.
What is soul sucking is unemployment. I just want to thrive professionally so I can enjoy myself.
ken107@reddit
Not everyone is the engineering type like us, who revels in the mechanical inner workings of things. Sadly this is the end of all engineering. The process no longer matters, only the end product does. It's a massive, profound change in the intellectual landscape, and nobody knows what things look like in 10 years. It's insane to happen in our lifetime, that we'll be all be here to witness it.
dtdisapointingresult@reddit
Not the OP but speaking for myself, I never cared about writing code. It's a means to an end.
I've been scribbling occasional notes for years on that one white whale project of mine, a videogame I thought I'd never have the time to make in my lifetime. (AI is making that more of a reality in the next couple of years). Of the years of notes I have, guess what, NOT ONE is about programming or implementation details. I simply don't give a shit about that. All my notes are about the story, gameplay mechanics, and even soundtrack.
For similar reasons,
It's not just games. It could be anything. "I wish I had a cool mobile app to do X"...I don't care how the app is written, just that it looks how I want and does what I want.
You claim this is soul-sucking? It's the opposite for me. It's liberating me from the soulless math machine that I've been forced to use to bring my ideas to reality until now. People like you actually confuse me. How can anyone care more about internal math than about the creative/director side of things? Is it the type of autism that makes some people love math challenges?
steosumit1335@reddit
Can someone please share their views. How much of an AI would you use to generate code if you are learning something new (say a new Python framework)? Is initially reading/understanding the documentation more important than writing code manually? Generating code without writing experience feels I have no control
jeremyckahn@reddit
You can't really understand deeply when you skip over the "learning deeply" part of the development process. That's one of the tradeoffs of agentic development. I gain control by generating guardrails I do understand (like tests) and validating the generated application code against that.
fulgencio_batista@reddit
How do you direct AI? Are you writing detailed prompts, telling it what algorithms/implementations to use, or anything like that? I haven't tackled a big code project in months, but I've had one I've left hanging because it reached 12,000 lines and my knowledge of programming wasn't sufficient enough to be a good director - I guess.
Global-Complaint-482@reddit
We’re using Claude in the spec process as well. Product people write the business requirements as an issue in Github. After reviewing the high level requirements, they set a tag, which triggers an action for Claude to review the requirements against the codebase.
Claude develops the PRD with tech specs. Product reviews and iterates via comments to Claude, then re-tags it. Tech takes over, and a dev reviews the tech specs, iterate with Claude, then re-tags it.
Action runs and Claude now takes a crack at making the changes and generates a PR. Dev reviews, tests, iterates.
We’re able to cut a ton of dev time this way. It also allows product to be more involved in the technical shaping.
EnergyNational@reddit
But isn't claude really bad at writing tests. I have found it tends to write tests to pass, not to actually test edge cases?
Global-Complaint-482@reddit
Depends how you build it. You can also write the tests at the beginning, after the specs, so it’s not writing to pass. Given context of the repo it’s not too bad at writing tests.
SvanseHans@reddit
How many service do you have? And how many lines of code?
Global-Complaint-482@reddit
It’s run on a mono-repo with multiple services and tools. We’re working on a cross-repo workflow next, as we integrate with different teams/products often.
JakeModeler@reddit
Use this https://github.com/gsd-build/get-shit-done
jeremyckahn@reddit
It depends on the task. When I want to create a greenfield feature, I iterate with Claude on a highly detailed ticket and then file it. I minize specification of implementation details and keep the focus on acceptance criteria (as though a human was implementing it).Then in a new session, I feed Claude the ticket and set it off for implementation via Plan Mode. From there I iterate with Claude to close the inevitable gaps and get the code into production-grade shape. Throughout the process, the prompts get more focused and surgical.
Smergmerg432@reddit
How do you double check for security leaks?
jeremyckahn@reddit
I self-review all code before merging it, generated or hand-written. So, I do my best to catch security issues etc. during that phase. I also have AIs review my code, and they help look for security issues as well.
Space__Whiskey@reddit
The vibe is strong with this one. May the vibe be with you on your journey.
jeremyckahn@reddit
I vibe code the initial implementation, yes. It's the quickest path to success (at least superficially). But I take the time that's needed to ensure the code is of professional quality (scalable, robust, secure, tested, etc.) before making a PR.
Space__Whiskey@reddit
as one should in 2026. you are a founding father of vibing, as am I. It is the future. In the future, we will say stuff like "back in my day, we had to iterate and refine after a one shot vibe".
vr_fanboy@reddit
same, 15 year swe here. my impostor syndrome is all over the place nowadays, yesterday a CC instance had to update a sft unsloth pipe to train qwen 3.5, it has direct access to the server, went to see how was work after 1 hour, it was in a really long fight finding 'bugs' and monkey-patching stuff in triton 3.2.0 directly in the env package, holy fucking shit, i gave you the unsloth guide, just bump triton to 3.6.0 for the love of god.
_bones__@reddit
As a senior software developer, I can't imagine burning tokens to bump a version. Like wtf are you doing.
It's stuff like that which makes me happy I can't use AI as more than a junior consultant at work. Even then, it's clear it just isn't very capable.
Glad it's working for you, I hope it won't explode in your face down the line.
elswamp@reddit
You sir—are frustrated.
Tr4sHCr4fT@reddit
this is called npm syndrome
claygraffix@reddit
100%, same here
iktomi3000@reddit
b
_pr1ya@reddit
Same here. Claude is super good for my official work and so many side projects that I have halted in the past, came to life.
Original_Finding2212@reddit
I think you will appreciate this:
https://github.com/OriNachum/claude-code-guide
It’s a guide as a plugin to Claude code, with a daily task to stay updated (supervised), and my own interpretation of features, based on my experience senior dev, DevEx team lead and AI Expert at work.
(I moved to Data Science lately)
I also maintain a NoteboomLM based on it, and have more upgrades coming.
I appreciate any stars if anyone interested or wants to support ⭐️🙏🏿
hoolieeeeana@reddit
It feels like adoption is growing but still very experimental for most people rather than fully integrated into daily work, have you tried building something small with Horizons to test if it actually improves your workflow? You should try it with the discount code vibecodersnest!
Sawati_sharma@reddit
We’ve started using them in small but meaningful ways not as “fully autonomous agents,” but as operators inside workflows.
For example, handling things like lead tracking, follow-ups, task updates, and basic coordination. It’s not flashy, but it saves a lot of time on repetitive work.
The biggest impact we’ve seen is less manual effort and fewer things slipping through the cracks.
We’re doing this through WorksBuddy, where AI is built into the system (like Lio for leads, Prax for projects), so it’s not a separate tool you just see work moving without constantly managing it.
CallmeAK__@reddit
The conversation has shifted from "can agents do this?" to "how do we manage 15 of them in parallel?" Gartner is even reporting that 40% of enterprise apps will have embedded agents by the end of 2026.
Most people are still stuck on one-off browser tasks, but the real scale is happening in multi-agent orchestration. We’re seeing a massive move toward "perception layers" that let these agents actually search video and unstructured data to make better decisions without manual prompting.
Wise-Lie-3867@reddit
The question I stopped asking: "Can it do this?" The one that matters: "Does it finish the task or just move it one step closer?" Those are two very different products pretending to be the same thing.
Own_Professional6525@reddit
Starting small has worked best for me-focused agents for specific repetitive tasks.
The real impact shows up when they’re reliable and actually fit into daily workflows.
StardockEngineer@reddit
All I’ve been doing for the last two years is building agents.
breakyourteethnow@reddit
Is it profitable? Something anyone get starting doing? I'd like to try to sell AI agents to businesses
eibrahim@reddit
I've been running a persistent AI assistant that handles a surprising amount of daily ops - checking emails, monitoring social mentions, drafting responses, keeping track of project context across sessions. The key shift was going from "I open ChatGPT when I need something" to "there's an AI that's always running and proactively helps."
The stack is OpenClaw (open source) connected to Telegram and Discord, with Claude as the main model. It runs on a small cloud instance. The persistent memory is what makes it actually useful vs just another chatbot - it remembers decisions from last week, knows my projects, and can pick up where it left off.
Biggest impact: I stopped context-switching as much. Instead of remembering to check five different things, the assistant monitors them and flags what matters. It's not AGI-level autonomy but it genuinely saves a couple hours a day.
GideonGideon561@reddit
Working in a marketing agency. AI agents seriously does help alot.
It does not REPLACE people but helps people work way faster and more efficient. takes alot of the heavy lifting off ESPECIALLY when it comes to research, ideas and finding partners/creators. Good for copywriting assistance too (ASSITANCE NOT COPY WORD FOR WORD)
woahdudee2a@reddit
AI is handy for finishing a for-loop or suggesting a variable name. coding requires real intelligence. once the hype dies down, real engineers will remain
EnergyNational@reddit
Yeah, I cant imagine having an agent build a project have zero idea how it works and then release it. Devs are important because they can make the design choices, see bad code, fix it ect. If an LLM generates all the code and you can't understand it then thats not development thats just buying an app from a cloud provider.
Madeonz@reddit
I use openclaw and Claude for most. I still use other LLM since i dont wanna waste tokens on Claude, i prefer deepseek over chatgpt.
But im also trying out superclaw .ai (There are many superclaw) and pairing it with claude too so i dont need to get a VPS since superclaw does that for me.
Im still keen to try out claude cowork vs openclaw
Im not a developer
Hiringopsguy@reddit
More than I planned to, genuinely. I stopped using it as a search engine and started using it as a workflow layer and let it handle the repetitive 80% tasks, while I handle the judgment part only.
For voice specifically, local models struggle with latency so I've been looking at other options. There was one that came up in that search and I am a fan of that now.
LegitimateNature329@reddit
Running agents in production daily. The honest answer: they work well for narrow, well-defined workflows with clear success criteria. They fail badly when you give them open-ended goals and hope "reasoning" fills the gaps. The breakthrough for us was shifting from "the agent figures it out" to "the agent follows a constrained execution plan with human checkpoints at high-risk steps." Less autonomous, more reliable. The 80/20 is in the tool design, not the prompt engineering.
andre482@reddit
I do audits on marine vessels and for writing reports i use copilot agent with connected database of regulations. I made strict rules for him and it works so far. Save around 50% of time.
PracticlySpeaking@reddit
Did you build that yourself?
Did you have to compile a database of regs, or was that already available?
CraftySeer@reddit
Find the regulations. Save them in a text file. Put that in your Claude folder. Tell the AI to obey the regulations or mention what is missing.
andre482@reddit
In Microsoft 365 Copilot you press Create agent —-> In knowledge section you press upload and add file you need to use for reference. After you create rules in same agent menu ( i used Cursor to look through observations i like in other report and asked him to create rules for structure and what quality i want. And most important what i dont want, but its more complex as you will see firstly your reports of quality you not accept). During inspections i do draft report and i just feed draft observations to agent. So far was working pretty well, but in future would be great to feed full draft report and put on autopilot.
SettingAgile9080@reddit
Been feeling like I was falling behind in this area, so it's kind of a relief to see that even here - where it's full of early adopters - most people are also using it for coding plus a few other experiments.
Outside of coding (Claude Code) and researching topics (Opus or Perplexity), a couple of things I've set up:
Been trying to get some of this working with local LLMs but still find myself reaching for foundation models for actual work. I feel like we're close - this year perhaps - to an inflection point where small local models start to play more of a role in these pipelines. Maybe not for all of it, but for the simpler steps.
golmgirl@reddit
recently figured out a solid workflow to have claude code autonomously start experiments, babysit them, analyze results, tweak and try again, etc.
totally mind blowing experience tbh
CleverJoystickQueen@reddit
care to elaborate? Sounds really cool!
golmgirl@reddit
nothing that complicated really. just define a bunch of tasks, some hyperparams that affect performance, example commands for training and eval, pointers to important code and data and configs. tell it to start running stuff, keep track of results, run other stuff. it will figure out the details and develop its own little reusable utilities. you gotta watch closely and poke at it and redirect occasionally until you trust the methodology
Ok-Ad-8976@reddit
Yeah, I just have it do benchmarks. When I want to benchmark a new model, I just tell it, here's the new model, download it, set it up, look on Unsloth for directions, and then go to town and tets in on all our inference hosts. and give me some plots at the end of the day, and then I go to bed. Next morning I have results.
I mean, first time I babysit it through it, or it babysits me through it, and then we document it in a skill. And next time I can just be like, if there's any questions, we have skills, use your skills. And it's usually pretty good at figuring out what to do.
I have my whole homelab being run by claude via ansible and terraform and it's getting pretty complicated, but we've been managing. The key thing is to put a lot of effort in the beginning defining a good architecture. Once the scaffold is in place, it can kind of infer what to do next.
aaronautt@reddit
I'm a senior sw dev and I use it everyday, the company pays for copilot which I use with vscode. I have it write 100% of scripts, about 50% of new C code and I use it to review anything I've written. I also run LLMs on a server at home for my personal projects.
thejacer@reddit
I connected local llama.cpp to discord and a custom desktop app with access to a few custom tools and brave search mcp. So I don't really google stuff anymore. I just ask Cortana (because ofc i named it cortana...)
No_Success3928@reddit
Do you refer to yourself as master chief 😂
sibilischtic@reddit
only uses heretic models
thejacer@reddit
Can’t believe I missed the opportunity to do this…unfortunately I haven’t found a heretic model I like. The LLM runs a bot in my kids discord as well and the heretic models can’t follow instructions well enough to NOT say adult shit in the kids discord lol.
sibilischtic@reddit
im currently enjoying DavidAU's glm4.7 heretic. i havent tried to push its language is though.
i would suggest for a kids channel you want it to be atleast a two stage process. where it forms a response then needs to make it appropriate before sending...
thejacer@reddit
I’ve only ever used highly ranked models with reliable instruction following in their channel. Never a 7b vicuña or Hermes or whatever fun but flexible models were popular. I actually tested and declined to deploy at all until llama 3 70b via openrouter. Since getting 2xMi50s I’ve been using GLM 4.6 V and Qwen3.5 35b. With these models I simply put the kid safe instructions in the system prompt and it has been perfect.
thejacer@reddit
obviously, but only after 9:45 pm on Fridays.
Loud_Economics4853@reddit
Agents turn my meeting recordings into action items, code agent writes my tests(no more edge case hell),and even drafts client emails.
last_llm_standing@reddit (OP)
can I ask how you built these agents? any specific library or techstack or did you build everything from scracth?
cinaz520@reddit
I just use Claude for meetings with a Jira integration and granola. I told it once to create Jira tickets from the most recent meeting transcript, iterated and got them into Jira. Then I told Claude to create its own skill based on my chat session. Ran it through a couple test meetings. Works flawlessly for my use case. It’s nice luxury. Can vibe out in team meeting reviewing and brainstorming together without worrying about the details getting lossed etc.
No_Success3928@reddit
Is that when the system goes to sleep and you can pretend your in charge again?
thejacer@reddit
“The system” is a weird name for one’s wife…
But yes 😭
Effective_Motor_4398@reddit
Bahahahah
TheLostWanderer47@reddit
We use agents mostly for research + monitoring workflows. Example: an agent that pulls updates from competitor sites and industry pages, summarizes changes, and posts a weekly brief to Slack.
Stack is simple: LLM + scheduler + tool layer. For web access, we wired it through Bright Data’s MCP server so the agent can fetch live data reliably instead of running fragile scrapers.
Biggest lesson: agents are useful when they have good tools and clean data, not just a model.
Spare-Might-9720@reddit
Totally agree on “good tools + clean data” being the whole game. The other thing that helped us was separating “fetch” from “interpret.” One job just crawls via Bright Data / MCP, normalizes everything into a simple JSON schema, and stores raw + diff by selector or section, then a second job runs the LLM pass over only what changed. Cuts token burn and avoids the model hallucinating from half-updated pages.
If you ever want to pull in internal stuff (CRM, support DB, etc.) to enrich those briefs, we’ve paired things like Kong and Supabase, with DreamFactory sitting in front of SQL as a governed REST layer so agents never see raw credentials or direct database access.
theagentledger@reddit
The unlock for me wasn't replacing individual tasks — it's batching the ones I'd context-switch on anyway and letting them run overnight.
goyardbadd@reddit
Claude is mostly used for my day to day work with the DOD. Im a cloud engineer. I make modifications to my code but other than that i think its pretty good for its useful intent.
xxtherealgbhxx@reddit
I'm probably going to get a lot of hate and ridicule for this.
I'm not a coder at all. I can just about understand enough python to vaguely understand what is going on. I've never written an app in my life. What I do have is an excellent and very broad understanding of technology at all levels.
I have JUST finished a full 40000 line application entirely and wholly in Claude Code. It does everything I need, customised specifically for my use case. The learning curve was real. I wasted a lot of time getting used to managing context and keeping Claude on track. It took me 3 weeks start to finish and the app is staggering for what it does.
A few things struck me.
It worked because of my general IT knowledge as Claude needed a LOT of nudging in the right direction as it wrote. Without guidance it wasn't quite clever enough to always get it right. I did have to refactor the code as it tended to let the core app grow to 1000's of lines. Claude doesn't seem to be anywhere near as good with context management as Codex is. Clade is ungodly expensive if you just let it do its thing. Splitting everything up and working on small chunks iteratively was the only way to keep it on track and focused.
But overall it was stupidly good. I am lucky my use case was an internal tool used by only a couple of people so bugs are not an issue I worry about. That said it's been used, stable and functional for a week now without a single bug showing up. Don't get me wrong, there were 100s of bugs fixed. I went through 4 complete rounds of security reviews letting it detect, fix and test holes. I'm sure more exist.
I'm certain a good seasoned coder would rip it all to shreds as trash but in reality I'm betting it's as good as (and probably a lot better) than many coders out there. After 30 years in IT one thing I've learnt is the 10/80/10 "Rule" applies to coders just as well as everything else.
dtdisapointingresult@reddit
No you're doing things right. Tools exist to save humans time. They used to have to train longbowmen for life, then they found out they could arm a farmer with a crossbow, give him 2 days of training, and it's good enough for 99% of cases.
my_name_isnt_clever@reddit
This is super interesting. I have a feeling your codebase is better than a lot of the professional developers being discussed in this thread.
xxtherealgbhxx@reddit
Maybe, the funny thing is I wouldn't have a clue how to work that out. I meticulously documented the code (well Claude did) and every single function is separately documented. The code follows a standard framework as I knew how important that was when I started. When I checked what Claude had written with Codex it was pretty impressed, identified the structure and clearly understood the code. Lots of confirmation bias I'm sure but I can't ignore the app works, is stable and even crossing LLMs they can decipher the code and make changes.
One thing I do agree with though is that unchecked it's going to lead to some truly awful code and disaster. Even after my 4th round of security fixes it was still finding more. I can imagine many people would bother with any. My code had input validation issues, sql injection, data leakage and more and almost certainly still has some. But doing nothing would leave them all in and that's scary.
Spiritual_Rule_6286@reddit
Like the top commenter, I've almost entirely stopped hand-writing boilerplate and rely on agents like Claude for the heavy lifting, but throwing everything into one massive prompt usually leads to unmaintainable spaghetti code once you hit those 12,000+ line limits. To prevent my core backend logic from getting bogged down by endless UI refinements, I strictly compartmentalize my workflow by letting my primary agent handle the architecture while offloading all the frontend generation to a specialized UI tool like Runable. This hybrid approach gives you the massive speed boost of AI assistance without overwhelming your primary context window with tedious React component updates
last_llm_standing@reddit (OP)
can you give a high level example of an app demo, which part do you give to claude and which part to other ai assistant and how do you connect these two?
CraftySeer@reddit
I made a little document organizer because I always have a huge pile of papers on my desk. It’s great for scanning receipts from my phone. I can use AI to ask for specific documents or sets of documents. It works really well actually. Going to implement a harder database for tallying of receipts and some categorization so I can get some reports for taxes which shouldn’t take long. Super useful. I guess there’s probably solutions already out there but why buy it when you can build it yourself? And I get to keep my own data private.
last_llm_standing@reddit (OP)
happy cake day! also whats your tech stack for building agents?
Comfortable_Ad_8117@reddit
I am a senior applications specialist for a company of 10,000 staff - and we have so many freaking apps I can’t remember what’s what! I use trilium to house all my notes along with a home grown API that funnels the notes into a vector database. I built an Ai widget that leverages local Ollama and the vectors to interact with all my notes so I can ask questions - What is the account number for xxx? Who is responsible for yyy software? Do we have any information on zzz?
my_name_isnt_clever@reddit
I do that for <100 staff, that sounds amazing.
Street_Smart_Phone@reddit
We’re using GitHub copilot and cursor. I use it every day. I’ve gotten to the point I point it to a Jira ticket and it does everything end to end.
rebelSun25@reddit
LoL... I'd like to know which industry you work in?
Street_Smart_Phone@reddit
Startup. 20 employees.
_bones__@reddit
Perfect use case. Rapid iteration, quality doesn't matter as much, as the functionality changes too often anyway.
Wouldn't want to use it in my established code bases though.
Street_Smart_Phone@reddit
Lots of fast iteration. We have big paying customers. Uptime and reliability is paramount. I’m following through and reading what it is going through. When it doesn’t do something right the first time, I amend the AGENTS.md file to guide it better next time. Because of this, I’ve build this AGENTS.md file so that it knows how to grab Jira tickets, it knows how to find the code bases, it knows which AWS account and Kubernetes cluster it needs to inspect. It knows how to grab logs. Lots of our infrastructure is now in IAC so the agent can understand much better.
Also, to be clear, this only works with the latest SOTA models (GPT 5.2+/Claude 4.6). Kimi k2.5 and the other models cannot keep track of all of the information properly and skips instructions. I’ve only been able to do something like this the beginning of the year and I’m still pushing it further but every time, I’m surprised.
hardcherry-@reddit
Not today Kash
tvnmsk@reddit
Openspec development workflow for coding using Claude code. Jira automations with Rovo. Some Claude agents sdk inside of GitHub actions if you need automations close to the code. Screenshot validation using Claude ecode agents sdk (non determenstic output of the software). First line on call agent, escalate to human when needed, ..
wildhood2015@reddit
Using Github Copilot that is given by our organization. I mostly use it with VsCode and must say pretty helpful. Still had token limit per day but i never crossed it. Most of times use it to generate documentation out of old code that no one knows, analysis of issues, generate some powershell code for small stuff, etc. But even with Sonnet 4.5 at times it gives wrong results, so have to take its output with grain of salt.
3dom@reddit
After months of experiments - not me, the hardware is too weak (M1 pro macbook). Waiting for the deep M4 Max discounts or a M5 replacement in couple months.
My play-station is fine (4090 + 64Gb RAM) but I'm too afraid to burn it to experiment with the LLMs. Got a sad experience recently.
last_llm_standing@reddit (OP)
i have the exact same harware, which open source model were you able to get the best results from?
3dom@reddit
"The best" would be an exaggeration, used gpt-OSS-20B and Qwens up to 32B and so far everything is a nuisance requiring a lot of tinkering, comparable to me doing the tasks myself. I suppose they could be much better once fine-tuned to / focused on 1-3 programming languages.
As a programmer I get actually useful results from the cloud Claude and chatGPT. To the point there they solve serious master-level defects where a group of five seniors couldn't resolve them instantly.
PracticlySpeaking@reddit
I tried some basic ("write the game snake in python") tests with those smaller open-source models, and was consistently disappointing.
They could write the game by reproducing a version they had seen before, but when I started asking for things like changing how the controls worked (left-right relative to the head instead of left-right-up-down) they got confused, started creating new errors and other problems.
mohdLlc@reddit
I have been using AI agents for at least a year. Recently there has been an inflection with everyone and their mom and grandmother picking up these tools. But the early adopters have been doing agentic coding for a while with tools like Aider.
last_llm_standing@reddit (OP)
so based on your experience, what do you think is the best framework out there for building agents on your own?
mohdLlc@reddit
No framework is best framework. I have never been sold on langchain/dspy/agentsdks from anthropic/openai. LLM agents just need a formal structure for tool calls. I have written a few coding harnesses like this: http://github.com/computerex/z
None of them use frameworks. For work we don't use frameworks. And one experiment we are doing is *literally* using Claude Code as the agent. Coding harnesses are general purpose agents/orchestrators. We use Claude Code with -p and --resume wrapped in a stateless HTTP api service for on demand access to Claude Code for general agentic orchestration.
last_llm_standing@reddit (OP)
id prefer to build one from bare metal too, for calude code, can your recommend a decent setup, do you use vscode,+ cloud code, or do use the claude code cline and plug it to an open source model?
epyctime@reddit
wot? he literally fucking told you, he has a stateless http api service wrapped around the claude-code executable, are you ok?
howardhus@reddit
whats the difference between your „agentic coding“ and „vobe coding“?
WeekendAcademic@reddit
Aider is OG but not as nice as the harnesses that came after it.
last_llm_standing@reddit (OP)
Aider vs Calude code, who comes on top?
anyesh@reddit
On my day work we have adopted cursor for some green-field coding tasks like removing RFs, cypress tests etc. On my own projects I use claude code for everything controlled by hooks to stay on track and prevent drifting as just CLAUDE.md is not enough.
Most interesting one is, for personal stuff I have built my own personal assistant that has access to various tools like reddit search, web search, maps, weather, voice, personal kb with rag and more… that I use to chat, simple Q&A, research, etc. works very well. I am planning to opensource this soon. https://github.com/Anyesh/calcifer
last_llm_standing@reddit (OP)
This looks cool man, Im new to CC, can you tell me how you setup your hooks so it doesnt forget?
anyesh@reddit
My setup is not perfect but I have used similar setup in most of my work and it just works lol. I asked claude code itself to create hooks.
I use pre/post tool use and stop hooks to make sure it doesn’t drift. They are all bash scripts that blocks drift and injects context into chat so cc corrects.
One example is I try to maintain layers in my backend code and make sure cc doesn’t mess it up with “simpler” approach. I have preetooluse hook like this:
PreToolUse hook: Enforces backend 4-layer architecture boundaries
Blocks: db-in-services, raw-db-in-routes, domain-no-dependencies, business-logic-in-repository
LocoMod@reddit
You silly 3 day old bot. Bots aren’t curious.
last_llm_standing@reddit (OP)
beep bop
ArchdukeofHyperbole@reddit
Not at all yet. I mess around with llms quite a bit for conversation and questions. I usually try out new models that my computer can handle, especially when there's some buzz on a new model, but I haven't really got into agents at all yet.
Stitch10925@reddit
Same, I'm kind of failing to see how I can set it up. I'm trying to run everything locally and have OpenWebUI running at the moment.
I don't want the tool to do the coding for me because I rather like coding myself. Assist me, for example during refactoring, yes. But to have it check PR's for obvious mistakes or to have an agent to run secirity/pentests would be AMAZING!
Anyone willing to point me in the right direction? Would be much appreciated!
brick-pop@reddit
I would rather ask, who hasn't?
last_llm_standing@reddit (OP)
me
tmvr@reddit
My devops stuff is running in the cloud therefore Claude only does the code changes in Agent mode through Copilot in VSCode for that, so that's not really what you (or at least I) would class as true agents.
People do use it extensively here, but output varies greatly. There are a handful of people who definitely multiplied their output with it, but I see enough people where I still do not really understand what they are working on for days/weeks sometimes based on what they eventually produced. This is both before and after they started to use AI tools, so no real change from introducing AI into the mix.
I still occasionally have to do stuff in local AD and for that I just one shot it with Claude Sonnet or Opus, but mostly Sonnet. I just give it a list of names or objects and a vague/short description what I want at it spits out a perfectly fine Powershell script with validating before doing any change to an object, error handling, edge case detection, output log etc. which almost works with the first attempt. When I see that something is not OK, it was always due to me forgetting to tell it some small detail. This is a great time saver.
last_llm_standing@reddit (OP)
do you have a workflow pipeline you can share, i see a lot of vibe coding going on which sounds nightmarish to me.
tmvr@reddit
There is no pipeline, it's Claude Sonnet and Opus through the Copilot plugin VSCode.
last_llm_standing@reddit (OP)
there are ways to improve it tons lot, like really really work well, ive been testing it for the past 1 hour. You can setup like a real pair programmer with TTD, and decide how much your want to go into the details of implementation
djdante@reddit
My business is nothing to do with coding - but I use it a LOT - I've built internal apps, two WordPress plugins for use by my clients, I've built a number of workflows that I use almost daily which massively speed up backend work .
Week by weeks agentic work is accounting for more and more of my work.
last_llm_standing@reddit (OP)
could you please expand on the the workflows, how do you built it? any specific template or framework you use?
djdante@reddit
The workflows I build with the DOE framework - I can't recall who originally came up with that - but it works a treat.
Then if i want to automate some workflows, I'll put the workflows into modal and call on them via n8n.
Luvirin_Weby@reddit
Claude for:
Planning for programs: I input requirements and ask it to analyze and plan the solution flow. I then review and change as needed, most changes are by prompting.
Coding: I no longer write code, just review and check, much of the checking is also done by automated tests.
Analysis of things like logs.
Document creation (specifications, reports, documentation, plans etc..) Though I still read through them before sending and occasionally have to do edits.
ChukMeoff@reddit
I quit coding 6 months ago and just orchestrated agents https://protolabs.studio
croholdr@reddit
It helped me tune my multi gpu ai rig, and gave me the confidence to self diagnose myself; i also use duck duck go; it’s ‘incorporated’ as I assume other major search engines offer.
last_llm_standing@reddit (OP)
could you please explain how ti helped you tune your multi gpu ai rig?
croholdr@reddit
i asked the ai about tuning my computer
Mythology89@reddit
Soy ISP local en Islas Canarias, llevo trasteando con la IA desde ChatGPT a nivel general. Hace unos meses "descubrí" Claude Code y empecé a crear un Jarvis privado para que me ayude con la monitorización de la red. Acceso en modo lectura a toda la red, LLM local en ollama con mi RTX 3090 y conectado a mi Telegram... Primero varias sesiones de peloteo con Claude Opus 4.6 definiendo exactamente qué quiero y como, una vez listo, el mismo redacta todos los .md necesarios para subir al LXC donde está ccode y cada noche hace 3-4 tareas. Yo me desierto y leo los update... Esto es una locura y me encanta
grabber4321@reddit
Steps: - use plan mode - write TODO.md - write README.md - use agent mode to implement it
thibautrey@reddit
I use it daily. I made an app for my computer that help me organize it. You can check it out at www.chatons.ai
sammcj@reddit
I use Claude Code for basically everything other than human communication at work, and have done so for the past year (Cline before that). Everything from software development, document analysis and creation, creating slide decks, research, building training material etc...
last_llm_standing@reddit (OP)
this is great, could you share your workflow for newbies to get into the game? Like high level overview of how you go thorugh your pipeline for software development would be great or if you have any recommeded reference
sammcj@reddit
As far as setup goes: https://smcleod.net/2026/03/the-advice-i-find-myself-repeating-every-time-someone-asks-how-to-get-started-with-claude-code/
And just replace Cline with Claude Code in this old post: https://smcleod.net/2025/04/my-plan-document-act-review-flow-for-agentic-software-development/
robertpro01@reddit
Yes, every day, but not really coding, more like explain this, is I make this change, how it will affect the codebase, or do this refactor where I already have tests.
TanguayX@reddit
I have. Been running OpenClaw for about five or six weeks now with sonnet as the orchestrator. But Qwen 3.5 is looking better by the day. I’m getting a ton more work done and moving way faster. Absolutely like having an assistant.
last_llm_standing@reddit (OP)
thats cool! can you give examples where you use open claw?
TanguayX@reddit
It’s processing images for me through either ComfyUI or whipping up Python scripts to process images. I explain what needs to be done and it’s like ‘eh, I can do that in Python’. Ok, you do that.
Or I say, hey, can you make me a contact sheet of all these images? Boom, sheet for review. Done.
Also just handling QC on things. Today, three images out of 300 were missing. I asked it to find the missing three…these have abstract names but are in logical groups. Might have taken me ten minutes of boring work to go through them all. OpenClaw found them in 20 seconds. Then I fixed them. Coworker action
It’s also really good at bouncing ideas for plans of attack of my work.
Sure, some of this stuff is just a good model doing its thing, but with memory, context, and access, it can truly help me get stuff done. I should say, it DOES help me get stuff done.
Ummite69@reddit
Copilot a bit, then had early access on Claude Code for testing and I can't move out of it. Game changer, for around 3 months now.
last_llm_standing@reddit (OP)
can you go into how you use it? like yoy work flow
alphatrad@reddit
Since 2024 bro
Waarheid@reddit
Whole team uses Claude code for software engineering work every day, it's great stuff, radically changed my life honestly
last_llm_standing@reddit (OP)
is there any specific framework you are using to ensure you cover edge cases and your code is not inefficient?
Waarheid@reddit
Not really, no. Mostly just having Opus do that for us in a new context (in addition to our own eyeballs).
WeekendAcademic@reddit
I would surprised if you're not using AI. A lot of coding is repetitive. A lot the rituals/processes we do everyday as devs is repetitive.
vertigo235@reddit
So I took on a new role at a new company in more of an operational role, I am formally a software engineer, and technical product manager. In my operational role at the new company I (before even using AI coding tools), realized that I had more architectural knowledge than our software engineering team. So, I started doing devops and small bug fixes, then I started using AI tools to push that further. I then started developing an AI orchestration layer to add AI tools to our existing application with the assitance of AI, and my own knowledge.
Fast forward a year later, and I'm using primarily RooCode, and Opencode as my own Jr Dev team. The orchestration microservice that I started is now an integral part of our operation, and I'm using my Jr Dev team to work on other critical bug fixes as well as new features on our core app as well.
robberviet@reddit
Coding and searching for new things.
crypticFruition@reddit
I've been building personal agents on Claude's SDK. Started with email triage and research tasks, now it handles scheduling too. The SDK is really straightforward to work with, especially if you're going bare metal. Definitely the right time to jump in if you're building from scratch.
Deep_Ad1959@reddit
using one daily now. I have fazm running on my mac — it's an open source agent that watches my screen and takes actions from voice. mostly use it for repetitive browser stuff: updating CRM entries, filling forms, managing social accounts. the things that aren't worth writing a proper script for but eat 30 minutes a day.
biggest surprise was how well it handles context. I say "update that lead's status to contacted" and it figures out which CRM tab I'm looking at and does it. not perfect but saves me real time.
repo if anyone's curious: github.com/m13v/fazm
ryanp102694@reddit
All day every day. My company has kiro-cli (Amazon). I'll frequently have many separate terminals in different directories for the things I'm working on. I don't have one of those crazy multi-agent workflows, but I'll get there.
I'm a software engineer. I would say it makes me easily 4-5x more productive.
Agents enable me to do things I otherwise wouldn't have thought about doing because it wasn't worth the complexity. Things like data collection, minor bug investigation, helper scripts to automate tasks, etc.
dragonmantank@reddit
How have you found Kiro? I like the planning phases and stuff, but the actual output tends to be super complicated (as in the solution could have been done more simple) and it constantly gets stuck.
ryanp102694@reddit
I don't tend to have issues with it getting stuck. It's definitely verbose though. Overall I've had a very positive experience
firesalamander@reddit
Only corp hosted ones so far. But yes. Lots of agents.
o5mfiHTNsH748KVq@reddit
Would you be willing to elaborate? Are you running autonomous tasks or are they supervised coding agents? You don't have to give details, but would you be willing to share the domains they're being used in?
firesalamander@reddit
Nothing fully autonomous, but both: supervised coding agents, async code hygiene suggestions. Domain: lots of SQL code.
o5mfiHTNsH748KVq@reddit
Thank you, that's actually interesting! I didn't expect you to say SQL!
firesalamander@reddit
SQL is a fascinating use case. Way less training data (raw coffee examples), less libraries like python, but teeny tiny syntax. All the interesting bits are the tables and how they relate, which you can really see the LLMs improving on understanding.
last_llm_standing@reddit (OP)
When you say corp hosted ones, you just use the existing one and you don't build any agents right?
Durian881@reddit
I'm using more of predefined AI workflows to do some specific tasks, vs AI agents.
last_llm_standing@reddit (OP)
oh, what is the difference, can you give an example?
Durian881@reddit
For workflow, steps are predefined, e.g. RAG, websearch, writing report, formatting to specific template, validate sources, etc.
For AI agent, it's given tools which it can decide to use to achieve its objectives and might involve orchestrating with other agents.
last_llm_standing@reddit (OP)
What you described as workflow, is basically what people in my company has been calling AI agents but I see your point, I wish there was more of a hardcode defenition so I can clear up with my colleagues that what they build are not ai agents
Agile_Cicada_1523@reddit
I think copilot calls agents to #1
letmeinfornow@reddit
I do for contract and RFP language. Works well on a variety of PM related documentation. Cuts the number of reviews down to almost none and reduces the number of people involved to generate the content from an SME perspective.
sine120@reddit
Work, yes everyday. It's the only way I can make sense of our huge codebase and obscure documentation
Stepfunction@reddit
We have GitHub Copilot and I make extensive use of it almost every day for software development work. It is incredibly helpful for improving productivity.
dinerburgeryum@reddit
Yeah I recently have for client work. I use a combination of Cline and Deepagents, both utilizing Qwen3.5-27B. Cline for interactive work. Deepagents for Python playbooks that have to get rerun. The Qwen3.5 MoE models fell tragically flat. GLM 4.7 Flash had potential but MLA means agentic horizons are prohibitively slow. I’ve been meaning to circle back to the Devstral line but haven’t gotten around to it.