devs who’ve tested a bunch of AI tools, what actually reduced your workload instead of increasing it?
Posted by Tough_Reward3739@reddit | ExperiencedDevs | View on Reddit | 41 comments
i’ve been hopping between a bunch of these coding agents and honestly most of them felt cool for a few days and then started getting in the way. after a while i just wanted a setup that doesn’t make me babysit it.
right now i’ve narrowed it down to a small mix. cosine has stayed in the rotation, along with aider, windsurf, cursor’s free tier, cody, and continue dev. tried a few others that looked flashy but didn’t really click long term.
curious what everyone else settled on. which ones did you keep, and which ones did you quietly uninstall after a week?
CallinCthulhu@reddit
Most of them
Agreeable-Clerk-4819@reddit
I was also exhausted. I kept Cursor for daily updates, but added Skywork to my rotation of tools for building new versions. It's the only one I've found that can quickly build a complete full-stack application without you having to create every single file step by step. This is incredibly useful for skipping the initial boilerplate phase, so I don't have to worry about managing configurations.
CarefulDeer84@reddit
honestly the tools that stuck for me were the ones that could handle specific repetitive tasks without needing constant handholding. like I'd rather have something automate API endpoint creation or data transformation than try to build entire features from scratch.
I think tools work best when they're narrow and good at one thing. we ended up using Lexis Solutions for automating our web scraping pipelines and data processing workflows, and that actually saved us from writing the same boilerplate over and over. the key is figuring out what part of your stack is eating time and automating just that piece instead of trying to replace everything.
sfscsdsf@reddit
I’m an overworked senior engineer in the low-level firmware and hardware space, doing testing, devops, infra, hardware, and field testing. So I don’t have the time or mental space to do everything perfectly, including writing docs, tests, and proper comments. Having VS Code and Copilot do those saves a bunch of my time, so I can focus on shipping the main features while quickly reviewing that non-essential but important software engineering stuff.
Internal_Outcome_182@reddit
Sounds like you’re not just an engineer, you’re a whole orchestra.
sfscsdsf@reddit
that’s the demand nowadays especially with AIs sadly
Internal_Outcome_182@reddit
Yes, but I still think it's the "nerds'" fault; learning boundaries would make life easier for everyone in the IT sector. Being a "YES" man is the worst thing in IT.
sfscsdsf@reddit
hmm good point, it really depends on the manager, right?
humanquester@reddit
I just use it like a less obnoxious stack overflow where all the posts are by noob devs.
But this last week I've actually just gone back to the old forums to look at years-old posts. It's about the same workload either way.
Clearly there are people here who are using it to write 5000 lines of code a day for their thing - good for them, but I simply can't do that while simultaneously understanding how the code works.
There are apparently many jobs out there where having the devs understand the code isn't very important, or these folks are just smarter than me. Either way, what I'm doing works for me.
z960849@reddit
I might be old, but looking at old forums makes you think more. And there's more opportunity to learn other things beyond the question you came with.
kincaidDev@reddit
My workload has definitely increased by using ai tools, but my output has also increased exponentially.
For instance, this week I couldn't get to my tickets because of customer support requests and QA questions. I used a custom AI tool I built to finish all my Jira tickets left in the sprint in about half an hour.
Today I'm working on another tool to automate the questions/debugging process and output reports I can review, so hopefully next week I can actually focus on the higher-level problems I'm supposed to be solving instead of trying to do deep work while getting thrown into random debugging sessions that usually turn out to be nothing important.
If this system works the way I think it will, it'll be the only way I can add new features to the product we launched a few months ago instead of just being stuck doing tech support.
ImprovementMain7109@reddit
Same experience here: most “agents” felt like managing an overconfident intern. The stuff that actually stuck for me is boring: Windsurf/Cursor-style inline completion, plus a repo-aware chat. 80% of the value is just: autocomplete whole functions, rewrite a block, summarize a file. Low ceremony, no jobs queue, no “let me orchestrate a plan” monologue.
For repo-scale changes, Aider is the only thing that genuinely reduced work. Schema change, SDK upgrade, annoying cross-cutting refactor: I let it propose edits, skim the diff like I would a PR, then iterate. That’s the closest I’ve gotten to “I don’t have to hold all this state in my head” without feeling like I’m losing control of the codebase.
Outside that, Continue in the editor with a decent model has replaced 90% of my “switch to browser, open chat, paste snippet” flow. Everything else I tried that marketed itself as an “agent” eventually cost me more time in debugging, explaining context, or undoing weird choices than it saved. The pattern for me: if I can’t cancel it mid-thought and just take over, it doesn’t last.
Careful-Remote-7024@reddit
I guess we're talking LLMs? I mostly use Claude Code (so in the CLI). GitHub Copilot in JetBrains products is quite unstable; it will sometimes even fail to just modify the code, entering some kind of infinite loop of "let me retry", which, well, consumes your tokens.
Anything structural can be a huge time saver, especially when other examples are already present in the code. For example, adding a new configuration option in some UI and all the boilerplate code to make it flow through to the backend.
Note I said structural, but design-wise it's good to check its work and often ask it to create the right abstraction layers. Typically, it will duplicate a lot of "check the cache, if missing call the service" code instead of creating a cached-service abstraction layer that would remove the need to duplicate those conditionals.
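The cached-service abstraction being described could look something like this (a minimal sketch with made-up names, not the commenter's actual code):

```typescript
// Hypothetical service interface for illustration.
interface UserService {
  getUser(id: string): Promise<string>;
}

// Instead of repeating "check cache, else call service" at every call site,
// wrap the service once in a caching decorator that all callers share.
class CachedUserService implements UserService {
  private cache = new Map<string, string>();
  constructor(private inner: UserService) {}

  async getUser(id: string): Promise<string> {
    const hit = this.cache.get(id);
    if (hit !== undefined) return hit;          // cache hit: no service call
    const value = await this.inner.getUser(id); // cache miss: delegate once
    this.cache.set(id, value);
    return value;
  }
}
```

Call sites then depend on `UserService` and never see the conditional, which is exactly the duplication the LLM tends to reintroduce.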
Having also done TDD with it, it's unfortunately not that great once it has too many tests to validate at the same time. It will often go from "I created functionA for testA and adapted it into functionAB to also cover testB" to suddenly creating a functionC that handles the case of testC and then delegates to functionAB, instead of rethinking how functionAB should work.
IMO the biggest limitation is how unable it is to "reflect" on the big picture, which is expected from an LLM, since it has no real reasoning ability. But nonetheless it's a precious tool that helps a lot with many activities, so I'm not going to dismiss it with a one-liner.
VanillaCandid3466@reddit
I just tried GitHub Copilot again after getting really, really annoyed with it suggesting total crap. It couldn't even get the parameters right for a call to a method that already exists in the same damn .cs file.
5 minutes after turning it on again ... off it goes. Utter crap.
serg06@reddit
Yeah that's why people use Cursor and not Copilot. The difference in autocomplete accuracy is night and day.
VanillaCandid3466@reddit
I use Jetbrains IDEs, and I'm not switching just for AI.
Confident_Ad100@reddit
AI tools in JetBrains aren’t the best. Feel free to use whatever but know you aren’t getting the best AI tools there.
riotshieldready@reddit
Copilot is a weapon formed against engineers. It’s so bad it has to be on purpose. It gives me the most random code suggestions that make no sense at all.
DepressionBetty@reddit
Had a similar experience with copilot after seeing some new model hype. It made me so much angrier to work with the LLM than searching docs & examples for myself.
VanillaCandid3466@reddit
I've taken to running LLMs locally on my 4090 using LMStudio. Using QWEN models.
I kick it off on a very specific code problem and let it do its thing while I continue to work on other stuff.
serg06@reddit
+1 Claude Code is amazing. I use it for the first draft of my PRs, and it usually does exactly what I would've done. Last night I had it implement a 600-line feature, and I only had to change about 30 of them.
immbrr@reddit
A lot of folks at my company (including me) are big fans of Augment - it seems to do a better job of handling larger codebases compared to e.g. Claude.
I've also been enjoying NotebookLM for quick-and-dirty infographics and explainers (definitely would be 10x better done by a human, but for the level of effort/time I need to put in to get them done - plus not bothering a designer - it's pretty darn useful).
gfivksiausuwjtjtnv@reddit
Claude code, Opus 4.5
It’s still pretty borderline. But I do pay for it myself because it saves me time and mental energy, of which I have little
it took me ages to learn how to use LLMs. I actually think there’s a bell curve where rank amateurs can vibe code shit, juniors produce slop, mid levels on my team actually never use it because they’re still learning how to design systems and they can’t prompt well or gauge the response accurately
Internal_Outcome_182@reddit
Dunno about Opus, but Sonnet was spamming too much code every time, rewriting everything instead of changing one thing.. annoying as hell.
PoopsCodeAllTheTime@reddit
I use it only for tasks that it can One-shot, or at least tasks that it can get most of the code correct with a few manual corrections. Anything else feels like a waste of time, if it didn't get it the first time, it won't the second time when the context is larger and more difficult for the LLM. Making custom boilerplate is great tho.
Xacius@reddit
Writing docs. Claude Code is great for this, especially with a good system prompt and subagent. "Write like me" is super easy when you already have documentation for it to reference.
iPissVelvet@reddit
Glean has been great for larger companies when connected to your Slack/Wiki and stuff. Great for searching for historical context that’s often just buried in some Slack thread.
Honestly ChatGPT has been great for performing auxiliary tasks that I’m not an expert of, but the experts are busy. For example I don’t bother our data guys anymore when I want to write some non trivial SQL. As a backend dev I can search through and understand front end codebases with Cursor (not writing any front end code, just understanding the data path).
Suepahfly@reddit
AI works great in small greenfield projects if you know what you want. I managed to go from idea to working prototype in just a few hours a week, complete with hardware design (a Bluetooth score display sign that connects to a timer).
I also tried using it in a legacy Sitecore project, only to watch it drain my credits and go off in completely the wrong direction. It basically started a large refactor on parts of the code base it shouldn’t even touch.
Wonderful_Device312@reddit
Only AI that I have allowed to write code for me is kombai. I use Claude and chatgpt to help me write code, but I don't let them write it. They just screw up too much.
Kombai has so far been pretty decent. Still screws up but it's strictly writing front end stuff that is largely declarative with minimal or highly boilerplate logic.
It makes me feel like they might be onto something. Rather than generalist AI tools, specialize them. Kombai only does front end and only certain stacks at that. I think that's a big part of why it works well. It's also focused on a domain where it's relatively easy to validate for security issues, and the price of bugs is fairly minimal.
Organic-Permission55@reddit
I am using Claude Code for tedious tasks where I need to make the exact modification tens of times (e.g. when I change a helper function for tests) and for locating code in large codebases.
ChatGPT (Thinking model + Web Search) I basically use for searching stuff and documentation of external parties.
Anything that really needs quality, I use my brain for.
miran248@reddit
I use gemini via kilo code from time to time and it's actually helpful at times.
Yesterday i told it to gather all tailwind color classes used in a specified file, move them to a specified css file, and give the variables semantic names.
After a few prompts i got the naming right and then told it to repeat the task on the remaining files (in chunks).
Within 30 mins i had most of the project using the new color classes. Then i configured the dark mode, told it to add css vars for dark mode and that was it, i had a very ugly but working dark mode theme.
I could do it myself but it would take me hours.
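The end state of a refactor like that looks roughly like the CSS below (variable and class names are made up for illustration, not taken from the commenter's project):

```css
/* Semantic variables replacing hard-coded Tailwind color classes. */
:root {
  --color-surface: #ffffff;
  --color-text: #1a1a1a;
  --color-accent: #2563eb;
}

/* Dark mode only overrides the variables, not every component class. */
.dark {
  --color-surface: #111827;
  --color-text: #f9fafb;
  --color-accent: #60a5fa;
}

.card {
  background-color: var(--color-surface);
  color: var(--color-text);
}
```

Once every component reads from the variables, a dark theme is just one extra block of overrides, which is why the whole thing fit in 30 minutes.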
Trevor_GoodchiId@reddit
Perplexity out of all things.
Because it narrows down search queries and summarizes a few dozen sources, it's a quick way to assess whether an approach exists at all and is worth investigating properly.
Bobby-McBobster@reddit
For coding, absolutely nothing, AI is trash.
I use it for project management, like give it a quick description of a task and it makes a nicely formatted ticket in the right folder and all that.
Ok_Substance1895@reddit
Claude Code and Amp are the overall best. Gemini 3.0 just came out and is said to have taken the lead. The three of them seem pretty close to each other. The biggest gain is in just letting it type for you. Don't let it do too much, that is where it gets non-deterministic and requires more debugging/time. You have to guide it in small chunks. Git commit each and every time it looks good enough to move forward. You will need to drop changes on occasions when it goes off the rails. You will also have to /clear or /new the session. Don't carry over the session when it goes bad or it will continue to go that way.
tango650@reddit
friend, your tools aren't the problem; you need to learn to work with the underlying primitive, which is the LLM itself. you've just started, so there's no advice for you other than keep at it. pick a tool (cursor's tooling is probably the strongest - by 'tooling' i mean the whole layer of QOL features on top of the LLM) and watch some youtube tutorials and whatnot. best of luck
Big-Discussion9699@reddit
Not really. I've been hitting tight deadlines recently and it just helped me meet them.
titpetric@reddit
Sourcegraph Amp free tier. It wanders off and does a good job for small menial tasks, but if you let it go off on its own it's gonna collapse under the weight of its own bad decisions. Cursor was fine too, but quicker to run out. I could stress test Amp for a few days on the free tier; Cursor ran out after a few hours.
Everything reduces your workload. But things cost money. My use would be $1200/mo with Amp, but the free tier has enough context for smaller features, extended testing, PoC work... Great thing for chore tasks, not so great at software design without some micromanagement
Supports MCPs, so in theory I could improve the experience around memory, task lists,...
grahambinns@reddit
I’ve not been using too much AI of late, as I have switched to Zed as my main editor and – despite the fact that it’s got AI baked in – I just couldn’t be bothered to set it up 😆
However, I did previously work with Codex, Gemini, and copilot, and I found (roughly speaking) the following:
Gemini was fine with generating Python, but would often lose its mind when generating Rust.
Copilot was a pretty good “fancy autocomplete” but hallucinated like mad, particularly when generating Rust and Elixir.
Codex would often claim to have fulfilled the prompt, only for me to find that it had sort of hand-waved vaguely at it, producing something that was syntactically correct and compiled, but which didn’t actually achieve any of the stated goals. To be fair, that could have been down to my prompting not being verbose enough. But by the time I had managed to write a prompt that got it anywhere close to fulfilling the task, I could have done the work myself twice over.
I will say this however: the latest Gemini models are excellent at analysing existing code and finding issues (and suggesting improvements). I recently used it to find performance issues with a humongous query that was built in multiple steps from the results of various horribly constructed typescript functions. Whilst I could see where the problems were from the query plan, I couldn’t immediately untangle the mess as I looked at it in my editor.
Gemini was able to highlight the big ticket issues pretty quickly, allowing me to pick off the smaller things one by one later on.
wardrox@reddit
Simplified my stack, kept good docs, increased test coverage and ops. Tool & model doesn't make much difference when you keep things simple.
I also pay a lot of attention to where my energy and focus goes, same for the project in general, and use ai to help with planning and orchestrating.
TheDerperer@reddit
Notepad and a brain.
Internal_Outcome_182@reddit
Stop using shitty AI.