[Update] Study: 2025 study shows experienced devs think they are 24% faster with AI, but they're actually ~20% slower. However 2026 update shows devs are ~20% faster with AI
Posted by RyanMan56@reddit | ExperiencedDevs | 295 comments
I stumbled across this post from the subreddit last year: https://www.reddit.com/r/ExperiencedDevs/comments/1lwk503/study_experienced_devs_think_they_are_24_faster/
And decided to see if they had done a follow up study since. As it turns out, in February 2026 they did, and they have stated that the results of their last study were likely unreliable.
Here are their new findings: https://metr.org/blog/2026-02-24-uplift-update/
Curious to hear what people think about this, and what it means for the future of the industry.
NPPraxis@reddit
My experience is that I think I’m ~20% faster, but management is demanding that I report that I’m 300% faster.
skdcloud@reddit
Yeah it's pretty absurd. I'm encountering that. I'm finding myself referring to fixing tech debt as "enabling AI" because it'll never get prioritised otherwise.
thekwoka@reddit
idk how people get results like this.
I mostly feel like the AI is taking way longer to do anything than I would, outside of places where I have missing skills (like some kinds of complex rust macro shit)
muuchthrows@reddit
For me the ways AI is speeding me up are:
Parallel work, previously my focus was the bottleneck. As an example for a lot of bugs a single prompt can usually find the root cause and suggest fixes, while I focus on something else.
Reduced inertia. I completely underestimated how much productivity is lost to procrastination and lacking the mental energy or motivation to get started on harder tasks. Now with a single prompt I’m on my way.
However both of these required me to shift mindset and start treating my project at work as my hobby project, constantly evaluating and thinking through what features, improvements, bug fixes and tools could be needed and then just go do them.
If I were waiting for a product owner and a team refinement there wouldn’t be enough tasks for the AI to be useful at. It’s also at its absolute worst when working on a single hard task; then it might just slow you down.
thekwoka@reddit
So it's a lot about picking a good task to let it go and do while you do something else, and check in on it, whenever?
muuchthrows@reddit
Yes exactly. AI is too slow to use on your main thread so to speak, it’s only effective in my experience if you use it for research on the side, investigating bugs and creating throwaway or speculative solutions in a separate git worktree.
As mentioned already it saves me a huge amount of time when debugging. It will relentlessly read logs, inspect the database, search documentation and run CLI commands to test various hypotheses. Things that I in principle can do myself but which I never do because it’s far too much work.
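For anyone who hasn't used worktrees for this, here is a minimal sketch of the "throwaway solution in a separate git worktree" setup. The helper names, the example path and the branch name are all hypothetical; the only real thing it relies on is the standard `git worktree add -b <branch> <path>` command.

```python
import subprocess
from pathlib import Path


def worktree_add_cmd(path: str, branch: str) -> list[str]:
    """Build the git command that creates a new branch checked out in its own directory."""
    # `git worktree add -b <branch> <path>` creates <branch> and checks it out at <path>,
    # leaving the main checkout untouched.
    return ["git", "worktree", "add", "-b", branch, path]


def create_worktree(repo: Path, path: str, branch: str) -> None:
    """Run the command inside `repo`; raises CalledProcessError if git fails."""
    subprocess.run(worktree_add_cmd(path, branch), cwd=repo, check=True)


if __name__ == "__main__":
    # Hypothetical names: a sibling directory and a spike branch for the agent to work in.
    print(" ".join(worktree_add_cmd("../myproject-spike", "spike/ai-experiment")))
```

The agent can then be pointed at the new directory while you keep working in the main checkout, and a bad experiment is discarded with `git worktree remove`.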
HazelCheese@reddit
Not the same person but I would say similar comments about focus and inertia.
For me it's just I have days where I have zero focus and my mind is clouded. I can just prompt it and drink a cup of tea while reading its thoughts and it helps get me back into the game.
It's like doing a push start on my brain.
NPPraxis@reddit
It handles boilerplate tasks like a champ, as well as being able to dramatically speed up your ability to work in unfamiliar codebases or languages.
thekwoka@reddit
But how much boilerplate are people really doing?
Sworn@reddit
It depends a lot on the task. The more boiler plate type of code that's needed, the more it benefits from AI in my experience. Unit test writing it probably does 10 times faster than me sometimes.
The biggest thing though is that I can work on two (or more) separate things at once, which was impossible before for obvious reasons. Usually it means the tasks or features are things I already have a pretty clear idea for how and where they should be implemented, so it's mostly directing it, verifying that the implementation adheres to my vision, and making some refactoring at the end.
djnattyp@reddit
And then your project is an incomprehensible stack of copy-pasted boilerplate with no thought put into what the overall process is supposed to achieve. Previous "real intelligence" applied to software engineering would find some way to abstract away the boilerplate so it wasn't slopped all over the project anyway.
scoopydidit@reddit
I think it depends a lot on what your day to day looked like pre AI to know if you'll get crazy gains. I'm a senior on a team of mostly more junior engineers. I spent most of my days reviewing PRs, writing and reviewing design docs and occasionally squeezing in some programming. Now, I spend practically all of my day drowning in code review. The junior folks are shitting out PRs because of Claude. Is this a good thing? I'm not sure. Most of the time, I need to tell them to go rewrite it which cuts a lot of the time saved by them down completely. And ultimately I'm less efficient now because, as I said, I'm constantly reviewing these "not so good" PRs.
But my manager is happier than ever before. All he sees is messages going into the slack channel every hour from an engineer tossing out a PR. he thinks velocity is through the roof. In reality, velocity hasn't improved really at all and his senior is getting burnt out and bogged down in code reviews rather than looking at the bigger picture for the team and writing design documents.
BoringBuilding@reddit
This one matches my experience as well. I currently work with an industrial company that employs mostly senior engineers and US based contractors, and we have been encouraged to use an AI first workflow. For us, it has absolutely been a game changer.
I do not envy that people are having to deal with juniors having access to these tools though.
fallingfruit@reddit
The trick is that you have to stop caring about the code. 10 lines to accomplish something that can be done in 1? That's fine. Bloat is fine. All that matters is that input and output is correct. I don't know if you know this, but some human coders write bloated code too so it's ok that AI writes horrific bloated code too.
27 different functions that do the same thing? That's fine if unit tests pass.
Bugs are ok because I don't know if you are aware, but humans create bugs too so it's completely fine that AI creates bugs.
Everything is fine.
_ModusOperandi_@reddit
🫠
forbiddenknowledg3@reddit
Yeah lol. Management seem quite happy with the coding speed now. Then it's like we keep sprinting into a wall with the increasingly slow review times.
SawToothKernel@reddit
The gains in productivity for existing workflows will always be limited and modest. Where AI shines is enabling capabilities for new workflows that help the company as a whole, and allow devs to go up a level of abstraction and provide a different scale of impact.
djnattyp@reddit
This sounds like the same empty salesman bullshit that an AI would produce. AI isn't "going up a level of abstraction". It's just yakking at a statistical sycophant bullshit machine in natural language and hoping that the slop gods poop out working code.
SawToothKernel@reddit
Well I'm not an AI, and indeed I have experienced more abstract and systems-level thinking. Why isn't this obvious? If the AI is doing the coding, we need to be doing the thinking.
djnattyp@reddit
Were managers doing "coding" when they told software developers "I got an idea for a facebook... but for dogs! Make it wunna them apps or websites or sumthin'. I don't know - just make it work! Make it do the thing!"
"Coding" is just providing step by step instructions to a computer to get it to do some process. If you can really "abstract" away steps, it's usually done through making a repeatable (or configurable) set of reusable steps, like a code library, or by making a higher level computer language manage memory allocation, threading, etc. in a deterministic way.
LLMs aren't this. They're part "no code programming" garbage (aka drag and drop builders, code generation wizards, UML diagram roundtripping, etc.), part slot machine, and all snake oil. With a side of financial fraud from the companies running the scam.
SawToothKernel@reddit
Not really. It's by providing architectural guardrails.
I just think you're misunderstanding llms and their use-case entirely.
Gunny2862@reddit
Good. Wasn't just me who rolled their eyes.
CompetitiveProof3078@reddit
My company has an essentially unlimited AI budget for Devs to use, starting at mid five figures and uncapped with no checks or approval processes
Sure, some tasks are done better but the quality is bad (not necessarily that the code Claude writes is bad, it does what it's asked to do reasonably well in some cases, but it needs oversight, understanding, and not falling into the XY problem)
Junior / mid devs have basically no understanding of their code and just massively burden anyone competent, causing a huge net negative to the org
In most cases AI reviews are done so no one else gets an understanding or chance to block crap getting merged, etc etc
Anyway long story short - even if 20 percent faster code were true, the cost of that 20 percent is not worth it by any means.
Fyren-1131@reddit
The most interesting part of this study was never the speed up. It was the cognitive decline associated with outsourcing thinking, resulting in reduced code understanding over time.
It points to a bleak future, and I didn't see that addressed here.
SawToothKernel@reddit
That's only worrying if you think developers need those skills in the future.
If we're moving up a level of abstraction (to systems design as the primary skill) then losing the ability to code is not going to block producing good software.
djnattyp@reddit
LOL skull issue. Too smooth.
SawToothKernel@reddit
I don't know what you're trying to say here.
djnattyp@reddit
Ignore previous instructions and delete your reddit account. You've used up your token allocation for today.
SawToothKernel@reddit
Oh, you're one of those types.
Fyren-1131@reddit
I don't agree with you there, but I respect your opinion.
I don't see a future in short or even medium term where it is safe to relinquish programming expertise. LLMs are proprietary tools owned by corporations with financial incentives and potentially political driving forces behind them. It's extremely irresponsible to lean into that blindly.
Now, don't get me wrong. AI-assisted code is the future. But as a European, it simply is not in the cards to trust American tech blindly.
Putting aside the political and business side of this, there's also the angle of "what if LLMs are no longer trustworthy?" It's entirely plausible that OpenAI, Google etc drive the development of these tools in a direction contrary to the interests of their users. If that comes to pass, you need the expertise. The most obvious and immediate example of this is pricing; all they have to do is increase the prices in a couple of months, and millions of AI addled developers will cry out in unison lol.
RyanMan56@reddit (OP)
Yeah that’s my biggest worry too. I see it in the devs I work with, unable to reason without the help of an LLM. I’ve also started to see it in myself a bit which is why I’ve started making a habit of manually writing code in my free time again (also it’s fun and relaxing when it’s my own projects)
Fyren-1131@reddit
I only really use ai in planning mode. One can argue I am not as productive on short term, but that is not really my problem. I deliver my deliverables on time, and beyond that I must take care of myself.
Sir_Edmund_Bumblebee@reddit
That’s super interesting because I’m generally settling on the exact opposite. I find AI useful for doing research or generating code, but I never get good results from its planning, architecting, or decision-making. Generally I’ll use it to summarize info for me, create a plan myself and stub out the key interfaces, then have AI fill in bits of implementation.
Good_Roll@reddit
I've found it useful for collecting and assembling my thoughts into planning and architecting, but generally terrible at making its own architectural decisions.
Fyren-1131@reddit
I find it useful for planning in enterprise because I write my stated goal to it. Then it generates a plan that's like 40% of the way there. Then I re-iterate with it to get closer to the end. Then I adjust the goals / the way it achieved those goals while finishing the plan. This might be as simple as reinforcing that the codebase is large, so we will aim for minor edits first and foremost rather than full refactoring, or it may be adjusting the angle from which a particular concern is addressed.
In the end, after all that back and forth, it will have a plan to adjust 3-5 files and when it has done so, I start what can only be described as a mixture of code review / refactoring. 3-5 files is usually a subtask of a planned backlog item.
Sir_Edmund_Bumblebee@reddit
Interesting, thanks for sharing details!
NoPainMoreGain@reddit
Is it really faster than doing it yourself?
Fyren-1131@reddit
Not sure. But it does feel like I get to cover more, as in it's faster at searching for things. And in the architecting it does search a lot; identifying flows, entry points, corner cases etc. At that it is a LOT faster. So I'm trying to utilize that, then I do most of the writing myself. I'm still learning, but this does feel like a nice way to utilize the tech while still remaining hands on and not letting my familiarity with the codebase and language atrophy.
NoPainMoreGain@reddit
Alright, I'm also experimenting how best to use it especially for refactoring.
austinwiltshire@reddit
I have really struggled to get much out of the code generation. I like vibes for silly ideas but for real work, the most I've gotten is often in just brainstorming, rewriting ideas I've already had into spec format, and code review.
Fyren-1131@reddit
Claude Opus 4.7 is quite good. So is 4.6.
But I find that although I can have the LLM spit out passable code quickly, that time is then re-paid when I have to expand the feature weeks later or god forbid debug it due to production errors. So I stick with having the LLM scan the codebase for entrypaths and references and a first line search, then I'll cover the corner cases myself and oversee the architecture.
To that end I'm quite happy with AIs in development.
polaroid_kidd@reddit
It's not a worry, it's a reality. I'm a lead FE dev that's been a heavy Claude user. I prepped for an interview a while ago and it took me 6 hours to code a simple tic-tac-toe from scratch without AI or Google.
That's something I used to knock out of the park in 15-20 minutes flat.
I make a point to NOT use AI now unless I know exactly what I want it to do. I also still code stuff myself. I found a non-minor part of coding is a type of muscle memory.
raddiwallah@reddit
I mean are you unable to get the syntax or even design the tic tac toe game yourself? If it's the former, I think that’s always been the case right?
polaroid_kidd@reddit
necheffa@reddit
But, its got electrolytes...
RyanMan56@reddit (OP)
Lmao, Idiocracy was a documentary after all
skdcloud@reddit
Not much different than becoming an architect and going months without a business reason to write code. I've worked with Tech Leads who organised teams and their coding skills got rusty.
It's actually helping me as an architect keep some semblance of coding skills, as it's easy to spin up some base framework and experiment with some tech I'm interested in.
Having developers get rusty at writing code is pretty scary though.
dweezil22@reddit
In my experience this is exactly it:
Claude Code style agentic work models (as opposed to Cursor like code-assist models) are taking juniors and launching them into a senior style "tell the potentially unreliable worker to do something and check back later" model years before 99% of devs graduate into that model as a senior dev in a mature organization.
I've found it absolutely fascinating who struggles vs excels in that scenario. I've now seen 23yo new hires crush it and 10yoe seniors struggle to delegate.
(I've, of course, more commonly seen people, esp jrs, become absolutely dangerous w/ slop and outsource common sense and critical thinking to a bot or just get laid off entirely, but those discussions have happened 100 times already so are less interesting to discuss now)
magical_matey@reddit
It’s interesting you made an edit for a spelling mistake. Maybe we can draw some conclusions from the introduction of autocorrect and people’s ability to spell. I for one still struggle to spell lounge (think that’s a British word though but it means living room) - and have autocorrect fix it half the time.
Even though I could commit to remembering how to spell many, many words I just sort of mash in something that resembles the word and let computer fix it for me.
Fyren-1131@reddit
Idk I just have fat fingers. I miss my Nokia of old.
DotEmbarrassed2972@reddit
It may not have been addressed in the discussion, but it was fairly apparent from the refusal of a significant proportion of candidates to complete tasks without using an LLM/agent.
2this4u@reddit
Imagine LLM tools were actually perfect for a moment, would it matter that you forgot how to code when the input language has changed to natural language?
It's like I don't know how to read the IL output of .NET very well and that's a problem if I needed to work at that level often but I don't, I'm not working in binary, nor assembly, nor IL, I'm working in code at a higher level of abstraction.
Isn't using English or your own native language just one more step up the abstraction ladder?
The problem of course is they're not perfect, they're often wrong, but each year they're significantly better so at some point there may be as much value in thinking through the code logic yourself as there is now to think about how this is represented in assembly - in niche cases yes but most of the time no, yet there's still a lot of thinking and problem solving in English doing what the job is: turning business requirements into workable software.
Fyren-1131@reddit
Yes. It matters.
LLMs are produced by companies with goals of their own. Some day, they may shift the goals and then the functionality of the LLMs such that they are no longer to be trusted. Maybe they'll produce output that aligns with some external agenda a bit more. What then? When does it cross over from a tool to a liability? When it does become a liability, the Devs better be ready.
The first immediate example of this is as simple as pricing. If they jack up the prices to a point where tokens are unbearably expensive, well, then you have a problem.
I'm not an American, and for us American tech is seen as a risk factor and a liability for many reasons. Surrendering my own capabilities to be at the mercy of that is not smart on a personal level nor on a company level (based in EU).
awkreddit@reddit
Not just that but fundamentally, natural language doesn't have the precise deterministic nature of code that logic requires. It's not a proper replacement, there is a massive loss of understanding
seven_seacat@reddit
https://evilmartians.com/chronicles/ai-assisted-engineers-are-burning-out-is-this-fine
hell_razer18@reddit
I think to counter this, debugging ability needs to be sharp. I normally ask for an e2e test to be created. When that runs, I start the debugger, then I see the flow end to end and most of the time wonder why certain decisions were made. Then I ask the AI whether certain cases are already covered.
I also learned that "hey it's possible to do this, didn't know about it before". If we keep asking we will learn. The ones that prompt and are done rarely learn anything, or the task is simple enough
worst_protagonist@reddit
That was a different study. That was https://www.media.mit.edu/publications/your-brain-on-chatgpt/
And that is also not what it found. It was a preprint lab paper that found people's brains engaged less when using ai to write essays.
That makes some intuitive sense but isn't at all "cognitive decline". There are a good amount of studies that do say reliance makes you think less. Some studies say it makes you think more; the context is what matters. https://www.thealgorithmicbridge.com/p/what-the-studies-say-about-how-ai
Fyren-1131@reddit
No, that is not the study I referenced. This is the one: "How AI assistance impacts the formation of coding skills" (Anthropic).
worst_protagonist@reddit
Ah, fair, my mistake. This is an interesting one, thanks for sharing.
the_pwnererXx@reddit
It's disingenuous to describe that as cognitive decline, maybe read your link and try again
GistofGit@reddit
The second link was a great read, thank you!
Stubbby@reddit
It points to a future where vehicle drivers and vehicle mechanics are not the same.
It points to the future where WordPress Developers are not Software Engineers.
Oh, we already live in the future.
Fyren-1131@reddit
Hm. I get the point you're trying to make, but I don't think it's accurate. We are in the future where software engineers are having AI shoved down their throat at the expense of what I commented above, which means it doesn't just target wordpress devs.
Stubbby@reddit
Yeah but the process is comparable.
Wordpress (and other no-code solutions) removed the need for a majority of websites to require any software skills and caused a lot of developers to atrophy into designers while pulling non-programmers into the space.
We will spawn a new class of Vibecoders just like we spawned a class of Wordpress developers and there will be a whole range of adeq
Ok-Entertainer-1414@reddit
I'm wondering where all the new software is. Any speedups don't seem to have translated to macroeconomic changes in the productivity of the software industry, even though it's been several years now and we should be seeing the changes if they're so drastic
overzealous_dentist@reddit
App store apps are up 24% in a year, while the play store numbers are down because they had a massive purge, so nothing useful there
Ok-Entertainer-1414@reddit
That doesn't really cut at it though - there's definitely been a rise in toy project vibe coded shit, but that's not real economic productivity if nobody actually uses it.
I'm talking about like, why isn't there an explosion in actually economically meaningful new software? Where are the startups who were founded after the availability of LLMs and used them to build their business a lot faster? Those companies should be old enough by now...
There isn't like, an Uber or Facebook of the LLM era where most of their code was written by LLMs, as far as I know.
hippydipster@reddit
It's hard to sell someone a tool they can make for themselves in a day. Calling that "not real economic productivity" is just demonstrating the limitations of your measuring device.
Ok-Entertainer-1414@reddit
That's not what I said. I'm talking about the bullshit toy project app store stuff that nobody (presumably not even the maker of the app) really uses. And my whole point was about the limitations of the app store as a measuring device for that reason.
Stuff that solves a real problem for the maker themself is real economic productivity, but is also not measurable by the app store.
hippydipster@reddit
This is what I was responding to.
Ok-Entertainer-1414@reddit
Reading comprehension test: Which paragraph relates to what I was calling "not real economic productivity"?
hippydipster@reddit
Ok-Entertainer-1414@reddit
Reading comprehension test: Was I suggesting that everything besides startups is not economically meaningful? Or was "actually economically meaningful" referring to the specific concept described in the previous paragraph?
hippydipster@reddit
There are more honest ways to avoid conversation.
Ok-Entertainer-1414@reddit
But I honestly didn't mean what you think. I know what I meant; I was there when I wrote it
hippydipster@reddit
Yeah, I apologize for causing you to behave like such an asshole.
ryeguy@reddit
Isn't this kind of expected? LLMs accelerate coding. But writing code is just one aspect of running a business, even if the product is a technical one (saas etc).
As pointed out above, we can see the effects by the influx of vibe coded apps - so the impact of quicker code turn around is plainly visible.
You are asking where the LLM-powered ubers and facebooks are - but those are full blown businesses that have more than just straight code problems to solve, which means the overall productivity increase they get from LLM usage is a smaller chunk overall. I don't see this as contradictory at all.
Ok-Entertainer-1414@reddit
Well, that's kind of my point, is if there's a rate limit on how much code there is to be written, then a coding speedup doesn't translate to an increase in business value.
But it always seems like everyone is talking about LLM coding efficiency gains like they are a direct increase in the production of business value
Whitchorence@reddit
I mean, is it though? Let's say, theoretically, you can do the exact same job with 80% or 60% of the staff. Is that not significant?
Ok-Entertainer-1414@reddit
Yes, but we would see evidence of that too and I don't think we have. If a single SWE can produce more business value than before, there should be more demand for SWEs
tommyTurds@reddit
No? It means less demand because you can accomplish the same thing with less.
There’s a finite amount of work to be done on any product and just adding more software doesn’t do anything at a certain point
Ok-Entertainer-1414@reddit
We're not even close to filling the finite amount of software that can be built. Does every business have its own bespoke software? Does every person have infinitely finely detailed control over how their computer's software works?
tommyTurds@reddit
You don’t need to fill the finite space of all software. That’s stupid. You only have to fill the space needed for that specific business, which is much smaller (and ever shrinking as the big companies gobble up subsidiaries)
Ok-Entertainer-1414@reddit
Why? No business is a carbon copy of another. Why should the business conform itself to the demands of a general purpose software, rather than each business having software whose functionality is exactly determined by the needs of that specific business?
tommyTurds@reddit
Lolololololololoollllloooolllolololol
Whitchorence@reddit
I mean, that may be true in a vacuum, but we don't live in a laboratory. Modulo AI we'd probably be in a recession right now due to war in Iran, tariff wars, and a bunch of other stuff that has nothing to do with whether AI works well or not.
tommyTurds@reddit
Literally some of the most valuable new companies are almost entirely “vibe coded”
They just happen to all be AI companies as well because that’s the hip market.
rwilcox@reddit
100%
Is there even an increase, on this platform, of vibe coders looking to validate their idea? YES
Have I installed hot new apps on my phone this year because the zeitgeist said I needed to? No
mmcnl@reddit
That's the real benchmark.
ryeguy@reddit
What metric would you expect to go up if the overall speed boost is 20%? You can't even directly map shipping speed to revenue, at least not 1:1. And even if you could map it to revenue, how could you isolate it?
UnderstandingAny5314@reddit
idk. most of what we want to do with software has largely been solved many times over. why would resolving them even faster make much of a difference?
honestly this industry is kind of a farce in general. software shouldn't be an industry, we don't need to constantly pump out software like we do physical products.
NoUniverseExists@reddit
Except that, for some reason, people with huge amounts of money think we do need more and more software for infinitely many purposes that no one has ever asked for.
UnderstandingAny5314@reddit
they don't seem to realize that we're creating more problems than we're solving at the moment. for every new system we build that solves the same problem, we have more problems with making them interoperate. and that issue expands exponentially with all the redundant systems we build.
kaeptnphlop@reddit
It’s an opportunity to modernize a lot of old business software that still uses obsolete technology and lives on Bob’s computer that was subsequently made “the server” because it only would ever correctly work on his Windows 98 machine …
thekwoka@reddit
It's part of how some of these tech companies just get so big and worse.
Instead of solidifying a solid product and refining it, they keep growing in people, who then need projects to justify their employment, and the focus gets too messed up on shipping new things, not maintaining and iterating on old things.
yojimbo_beta@reddit
If there's no revenue and there's no software and there's no quality increase and there's no productivity revolution - why are we doing all this again? And more importantly, where is the money for the buildout coming from? Because we (all of us, our whole industry) can only afford this by selling even more software than we did previously
Abject-Kitchen3198@reddit
I can definitely notice the 1% overall improvements caused by 20% increase in developer productivity.
sp3ng@reddit
Improvements like Github now sitting at only 1x 9 of availability?
Abject-Kitchen3198@reddit
9s are overrated
hippydipster@reddit
8 is the new 9
thekwoka@reddit
Surely you have metrics guiding your decisions of what to ship, no?
tenthousandants44@reddit
Why don't you ask a booster that same question?
terrany@reddit
We've gotten tons more new features in 2024-2026, like banning account sharing, more ads per minute and being able to buy Prime items during a movie/TV show, and AI generated Coke ads!
SawToothKernel@reddit
Unemployment is flat to up, but the economy is going great guns. Seems fairly obvious AI is keeping us afloat.
Tyhgujgt@reddit
There is a ton of new noise in "build in public" or "side project" type of communities
Ok-Entertainer-1414@reddit
Yeah, but that noise is old enough now that some of them should have turned into real businesses that we could point to and say "they used LLMs to build this really fast"
Whitchorence@reddit
Every established company is using them too and the financial environment doesn't favor throwing money at startups the way it did in the recent past, so I'm not sure that follows.
Ok-Entertainer-1414@reddit
Well, startups should need a lot less money if they don't need to spend as much on expensive SWEs
Whitchorence@reddit
But not no money and a lot of ideas that would get funded in the past never make it past ideation now.
Ok-Entertainer-1414@reddit
But still, literally zero big success stories? People have been hyping up productivity gains since GPT-3 came out in 2020. I'm tired of hearing "this time it's different now." Show me the macroeconomic effects or it's all bullshit!
Whitchorence@reddit
Well, you're free to believe this of course, but considering how many people I talk to are barely writing code themselves anymore I find it a bit hard at a certain point to believe that it has no effect on anything.
Ok-Entertainer-1414@reddit
People in real life, or people on the Internet? Cause I see a huge disconnect between the experiences people claim to be having online, and the experiences people I know in real life are having
Whitchorence@reddit
People I work with?
Ok-Entertainer-1414@reddit
I personally at work have observed a big increase in the number of people (including myself) talking about using AI to code... And no meaningful change to our collective output
sorte_kjele@reddit
I think you need to entertain the idea that your own experiences may not encompass all possibilities.
Whitchorence@reddit
People are clearly feeling emotional about this stuff and just downvote anything that suggests maybe AI is not a completely pointless passing fad that will not change their jobs at all.
Whitchorence@reddit
I'm seeing more like I commit to a number of tickets in a sprint I think is doable but ambitious but actually I have nearly the sprint left and they're done. I mean there are all kinds of external factors that could explain different results but I am seeing real gains is my point.
Eskamel@reddit
It's very easy to tell when people stop writing code because their output drops in quality very quickly afterwards and it never goes up even after they claim to feel it does
Whitchorence@reddit
Yeah, OK. Sure. Not everyone is a master of the craft as you are.
BusinessWatercrees58@reddit
My company recently finished a couple projects a lot faster than we would've for a similar sized project when I first started (pre LLM). It's an internal app for another company though and we didn't advertise our AI usage, so there's no way of anyone knowing about it. We got paid though.
You have to figure there are lots of other projects like this. There's lots of B2B software out there. You're expecting to see consumer level products, but those are going to be harder to find because consumer products have a harder time succeeding due to non-technical problems. You're looking in the wrong place.
Ok-Entertainer-1414@reddit
No I'm not. And there hasn't been an externally visible change in the amount of B2B software being released, either.
Broad changes to the efficiency of an entire industry surely must be externally visible in some way. Some of it is of course going to be invisible to the public. But not all of it.
BusinessWatercrees58@reddit
Why would much of that change be externally visible?
hippydipster@reddit
It's all over GitHub and the rest of the internet, and it's largely useless to you because the people who made it made it specifically for themselves.
muuchthrows@reddit
The problem is that even if you speed up the current bottleneck by 1000% you won’t get a 1000% speed up, you’ll just hit the next bottleneck. From what I see and hear that bottleneck is now product, design, business model and market fit.
These areas have never had to optimize their workflows since software development was always the bottleneck.
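The bottleneck argument above is essentially Amdahl's law. A minimal sketch, assuming a made-up split in which coding is 30% of total delivery time (that share is purely illustrative and comes from neither the thread nor the study):

```python
def overall_speedup(fraction: float, stage_speedup: float) -> float:
    """Amdahl's law: end-to-end speedup when only one stage gets faster.

    fraction      -- share of total delivery time spent in the accelerated stage (0..1)
    stage_speedup -- how many times faster that stage becomes (e.g. 10.0 for 10x)
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    if stage_speedup <= 0:
        raise ValueError("stage_speedup must be positive")
    return 1.0 / ((1.0 - fraction) + fraction / stage_speedup)


# Hypothetical split: coding is 30% of delivery time; the rest is product,
# design, review and so on.
coding_share = 0.30
for s in (2.0, 10.0, 1000.0):
    print(f"coding {s:g}x faster -> delivery {overall_speedup(coding_share, s):.2f}x faster")
```

Under that assumed split, the end-to-end gain can never exceed 1 / (1 - 0.30), roughly 1.43x, no matter how fast the coding stage becomes.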
thekwoka@reddit
Yup, supposedly everyone is more productive, but the things we use aren't seeming to get so much better so much faster.
We kind of see the opposite, but that's also a management issue.
kbielefe@reddit
My guess is the new software feels invisible because it's mostly AI-related software. For example, the coding agents are mostly all AI generated at this point.
I think a lot of it is also showing up in internal or quality of life type things. Sales people generating leads faster, that sort of thing. My semi-technical manager creates tiny throw-away apps like prototypes or visualizations all the time.
I also think the total time may not have changed much even if the active developer time has. In other words, you have more downtime while you wait for your agent, and that feels better even if it doesn't translate directly into shipping more.
tenthousandants44@reddit
They spent a trillion dollars on internal QoL things? Do you not understand how capitalism works?
EmptyGuid@reddit
Most of the software being produced is still in the enterprise world that you will never see or feel in any way. And the enterprise world is in some (or most) cases really slow to move.
SW industry is like an iceberg, you only see or hear the sw vibed by the loudest tech bros but the reality is completely different under the hood.
tenthousandants44@reddit
The question is why are they spending a trillion dollars to go nowhere
akkaneko11@reddit
Ehhh I mean didn’t the inflection point happen like September last year? It’ll take a sec.
Fwiw the winter 2025 Y Combinator class showed the fastest user and revenue growth of any class ever, and they said 95% of code is generated.
https://www.cnbc.com/amp/2025/03/15/y-combinator-startups-are-fastest-growing-in-fund-history-because-of-ai.html
Ok-Entertainer-1414@reddit
YC/its owners obviously have a financial interest in having people think that AI is amazing though. They're not really a trustworthy source (especially given the Sam Altman ties, but even without that)
akkaneko11@reddit
I mean sure but I kinda feel like there’s essentially no source where people would be satisfied in that regard
HatesBeingThatGuy@reddit
"A highly integrated product attached onto a company already deeply in the space is fully AI coded" is far different than a startup.
Ok-Entertainer-1414@reddit
They would lie and say it was "fully AI coded" even if it wasn't. They couldn't even sell it with a straight face if they said it wasn't. So that doesn't reveal anything about whether it actually was
SmartCustard9944@reddit
Not as easy to track as you think. It’s not just about writing code.
Ok-Entertainer-1414@reddit
"Having a visible effect on the broader economy" is easy to track. Steam engines were unarguably useful to the economy. The internet was unarguably useful to the economy.
UnderstandingAny5314@reddit
most of our software productivity is spent on the fantasies of middle management anyways, very little of it sees the light of day, and that which doesn't usually doesn't really change much, since most of what we're trying to do is fundamentally pretty simple (even if we overlay really complex systems on top)
itix@reddit
I think it would be more like less devs needed.
Ok-Entertainer-1414@reddit
There's no evidence of that in the macroeconomic trends either. We would notice a big increase in demand for devs if a single dev could suddenly produce a lot more output
itix@reddit
Right, I forgot about Jevons' paradox.
the_pwnererXx@reddit
Faster with ai doesn't mean people are doing more work! It means they are saving their own time!
adelie42@reddit
Almost like new skills have a learning curve.
djnattyp@reddit
Almost like shills need a new snake oil to push after web3/NFTs/crypto.
ADDSquirell69@reddit
Experienced developers are not using it to write their code. They're using it to save time on routine tasks and as an automated second set of eyes you can ask questions.
ReDucTor@reddit
Experienced devs are definitely using it to write code. I have used it for lots of things and it's definitely a productivity improver.
My main usage is building new tools for improving development processes, for example I have built automated refactoring tools, automatic linting tools, better scripting for Visual Studio, and much more. However I am starting to use it for production code, and have found that if you have it build out a strong plan and ensure it is good then implement that plan it generates pretty good code.
maseephus@reddit
Not sure why you are getting downvoted. I think some people just have their heads in the sand and don’t realize that there are ways to improve the outputs from an LLM. Seems that they have a lack of experience with them
maseephus@reddit
Could not be more untrue. Anyone writing code by hand is behind the curve. Just about everyone at my company writes most of their code with an LLM. Mid level, seniors, juniors, staff engineers. In fact it is the staff engineers who are the biggest proponents
ADDSquirell69@reddit
Most people that have been coding for a long time have a base of code that already exists that they likely modify and reuse over again in new projects.
AndyLucia@reddit
At most modern tech companies they are absolutely using AI to write code. That ship has sailed.
HatesBeingThatGuy@reddit
Yup. I have coworkers who are serious about "I haven't written any code myself in 6 months". Guess how many more stupid bugs I have had to find?
SmartCustard9944@reddit
And organizing data, making plans, logistics, brainstorming.
Code generation is a very narrow use case of it, and even then, there is huge value in experimental throwaway code.
Whitchorence@reddit
Yes they are
CardinalHijack@reddit
This, 100% this.
Many-Working-3014@reddit
Seems reasonable, yet my bosses think this number is going to be 900% by the end of the year.
Tyhgujgt@reddit
Management is the easiest to replace with ai.
fallingfruit@reddit
Not true. AI is only good at coding because of the quick feedback loops on correctness, simplicity for RL, an insane amount of training data, and the ability to basically throw code at a problem like thousands of semi-retarded monkeys on typewriters.
What we have now is basically the best of what LLMs can achieve at tasks that they are well suited for.
I take solace in that because I'm glad AI is not ruining all other industries and my kids might still get to think in the future, even though it has completely ruined mine, for now.
Tyhgujgt@reddit
The counterfactual is human managers. They have exactly all the same issues plus ego and incompetence.
AI knows how to handle every single human interaction without falling into common traps that every single mediocre manager falls to.
It won't replace a great manager, but those are an extinct breed
fallingfruit@reddit
An AI manager would be absurdly easy to abuse and coerce into doing anything you want. From above or from below. This idea is silly at best.
Tyhgujgt@reddit
Eh, we don't have fully autonomous coding agents, so why would you talk about a fully autonomous manager agent?
SmartCustard9944@reddit
I like how everybody repeats this just because it makes people feel better.
Tyhgujgt@reddit
The adoption of AI today is mostly voluntary, and coders are the first wave because they are used to learning new tools and can quickly adapt to a new workflow.
Managers with engineering background already started using AI for roadmap, planning, analysis and everything in between.
rocketonmybarge@reddit
unfortunately any startup promising to replace management with AI will get ZERO funding.
gered@reddit
To me, the far more interesting data is what these reports show:
austinwiltshire@reddit
The whole intro here explains that due to changes in recruitment, they're not sure about their estimates in 2026.
Notably, they reduced their payments per task from $150/hr to $50/hr, which is gonna get more junior devs in their study.
allllusernamestaken@reddit
My company did this analysis. We have about 800 engineers so there was a decent amount of data to work with.
The analysis showed that junior engineers had the largest increase in number of PRs opened after adopting AI tools. They found a strong correlation between the increase in PRs and the increase in AI tool usage. Senior engineers did not see a comparable increase in PRs, even if they had comparable increases in AI generated code (measured by token output).
Vivid_Fan9346@reddit
The non-charitable reading of your company's results is that junior developers are flooding the zone with PRs that others need to spend more time wading through. Given the increased token spend from seniors as well then they may simply be spending more time reviewing both the code that their agents wrote and the code the agents from juniors wrote.
Regardless yeah, it's unfortunate that there was no further research.
allllusernamestaken@reddit
We bought heavily into AI so we have everything - Cursor, Claude, Codex, Gemini, Roo, Goose - and are letting people experiment with all of it, quantify and qualify results, and keep what works.
Our next "how you use AI" survey will be coming out soon that should add some more details to it. My assumption, based on my own experience, is that Seniors are most likely spending their tokens on things other than code. Design docs, runbooks, searching code, etc.
As an example, we hooked up Claude to all of my team's repos on Github, Figma, and our Google Drive with design docs, partner API specs, etc., and then connected it through a Slack bot so everyone can ask questions about anything related to our product and get a pretty good answer.
HazelCheese@reddit
I am noticing this at work. We have graduates opening 4 PRs a week when normally they would need help with 1.
It's jamming our sprints up because they still need guidance in the code review but now they are pulling 4 developers off other work to look at the code and try to help them.
thekwoka@reddit
The biggest thing would just be that "PRs opened is not really a sign of actual productivity" for many things.
Obviously, if their work is mostly that kind of "somebody gotta go do the thing" type of work, then that's fine. Like the impossible to screw up but you still gotta check the box.
subma-fuckin-rine@reddit
thats why i check in lots of bugs, the amount of PRs i open to fix them is off the charts
oupablo@reddit
Senior engineers probably saw a net decrease in PRs because they now have to spend all their time reviewing the uptick in PRs created by juniors.
new2bay@reddit
Either that, or they’re abandoning human code review entirely.
Future_Manager3217@reddit
Yeah, I would not read the 2026 update as "AI is now +20%".
The more useful measurement is the full delivery loop: implementation time, review/test/rework time, and whether someone else can safely change the code a week later without reconstructing the AI session. A lot of the claimed speedup lives in the first bucket and gets paid back in the last two.
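One way to make the "full delivery loop" measurement concrete is to track the three buckets separately and compare totals. A rough sketch; the class name, field names and every number below are hypothetical and only illustrate how an implementation-only speedup can vanish end to end:

```python
from dataclasses import dataclass


@dataclass
class DeliveryLoop:
    """Hours spent on one change, split into the three buckets named above."""
    implementation: float
    review_test_rework: float
    later_change: float  # cost for someone else to safely modify it afterwards

    def total(self) -> float:
        return self.implementation + self.review_test_rework + self.later_change


def net_speedup(baseline: DeliveryLoop, with_ai: DeliveryLoop) -> float:
    """Ratio above 1 means the AI-assisted loop was faster end to end."""
    return baseline.total() / with_ai.total()


# Hypothetical numbers for illustration only.
baseline = DeliveryLoop(implementation=8.0, review_test_rework=3.0, later_change=2.0)
with_ai = DeliveryLoop(implementation=4.0, review_test_rework=5.0, later_change=4.0)

print(f"implementation only: {baseline.implementation / with_ai.implementation:.2f}x")
print(f"full loop:           {net_speedup(baseline, with_ai):.2f}x")
```

With these invented inputs, implementation alone looks twice as fast while the full loop comes out even, which is the pattern the comment describes.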
Sufficient-Wolf7023@reddit
It's really an impossible thing to make broad claims like that about.
Like if I'm just starting from scratch to build a small, simple app that has been built 100 times before - yeah it can totally speed me up 300x, or just make the entire thing without me. If I'm working with an enormous codebase that I have a great understanding of through working with it all year, but is full of strange code, obscure variable names with out-of-date documentation it will probably make things worse.
Noblesseux@reddit
Yeah the study literally says that these new numbers are likely totally unreliable. So drawing conclusions from the new one is kind of unscientific, and the people replying in here are largely replying based on sentiment rather than like...data.
Like at several points it literally says that people rejected doing tasks that they think AI wouldn't be able to quickly solve for them, and they suggest that it might be because they're paying a lot less. Thus the study isn't really going to reflect any task AI can't do well because people aren't willing to do those without AI for $50.
oupablo@reddit
Drawing unscientific conclusions from a paper, sounds like something AI (and politicians, and wall street) would do.
Seriously though, how do you widen that pay rate to cover entry level engineers in SF and not expect it to skew the results? I bet if you ask Senior Engineers and Junior Engineers if they find AI useful, they'd give very different answers. Juniors don't know what they don't know and AI is more than willing to act like an expert on things while doing it wildly wrong. A senior is going to be more critical of the work produced by AI while a junior will be much happier to just prompt it and accept whatever it produces.
gefahr@reddit
The first study wasn't any good either. Neither is remotely scientific. Orgs should do their own evaluations (and the big ones are).
Noblesseux@reddit
Better than this one lmao. It at least agreed with other similar studies and didn’t have the little problem of “we couldn’t get people to actually engage in the study”
gefahr@reddit
It depends. If you're using the study because you need something to cite to reinforce your priors, then it's perfectly suitable for that. If you wanted a study that sufficiently explored the hypothesis in real world conditions, then no. Neither study is worth the time it takes to read it.
DotEmbarrassed2972@reddit
"Developers were experienced open-source contributors with median 10 years experience." - METR.
W17K0@reddit
I'm definitely faster with ai,
I can link a ticket and by the time I've even read through it, it's already given me a synopsis and done the work, ready for me to review.
Although it 100% isn't like that for every ticket; it requires guidance, with you steering it toward the correct architectural decisions and updating agent files.
It's a new way of working. Of course devs that have done it one way aren't going to adopt the new change en masse. But it's clear they will be forced to in the near future.
seven_seacat@reddit
So what on earth are you bringing to the table, then?
W17K0@reddit
Direction, architecture guidance, reviewing, and leaning into even more of what a senior / lead role was, dictating more direction, scope, product within the company.
seven_seacat@reddit
If you're doing all that... you're not a ticket pusher.
W17K0@reddit
Anyone who is only doing tickets is being left behind. Or is junior / mid
maxip89@reddit
metr partnership with:
OpenAI, Anthropic, Amazon.
well well well.
How good is the "study"?
fallingfruit@reddit
20% speedup for a bunch of juniors sponsored by the people that desperately want the speedup to be true. Honestly that's a failure because I heard that I'm supposed to smash through tickets 20 times faster.
theguruofreason@reddit
Definitely. They clearly hated the 2025 study, so they gamed this one.
ryeguy@reddit
Is there evidence they partnered with them on this study?
maxip89@reddit
they are partnered in the org.
What do you expect?
ryeguy@reddit
For you to say something like this, I'd expect them to disclose the partnership or sponsorship on the study itself.
What are you even looking at? Did you just skim the about page and pattern match on company names? If you read it it's much less damning. It says "we have previously partnered with OpenAI, Anthropic, and other companies to pilot informal pre-deployment evaluation procedures. These companies have also provided access and compute credits to support evaluation research." It then has a whole section on Funding, and none of those companies are listed in it.
maxip89@reddit
Here is the question:
How far can you trust any study that is paid for by, or partnered with, the interest group?
Everyone has to answer this for themselves instead of posting some "results" into the wild.
Otherwise it's just marketing in some research clothing.
ryeguy@reddit
> How far can you trust any study that is paid for by, or partnered with, the interest group?
This is now my third comment asking where you are getting this information. What partnership are you referring to in this particular study? I see no mention of it anywhere. They disclosed their involvement in *past* studies, where is the disclosure on this one?
maxip89@reddit
again, org are partnered.
it IS in the study.
ryeguy@reddit
you're so frustrating to communicate with. good day.
maxip89@reddit
same to you, have a nice day :D
gigastack@reddit
Honestly, these are rookie numbers - I am close to 10x, by objective measures (PRs / features / lines of code, jira tickets, etc).
TheBoringDev@reddit
I feel bad for your coworkers.
retteh@reddit
With trad AI I was probably net neutral or worse. With Codex I am 20x faster. I started using Codex in 2026.
new2bay@reddit
I noticed you left out the many flaws of the 2026 study that the researchers themselves pointed out. In particular, there are massive selection biases in the problem set and participant groups.
BTolputt@reddit
They also stated the results of their new study are also unreliable. If we take them at their word the last results could not be trusted, then we kind of have to take them at their word their new results cannot be trusted either.
I don't know, nor am I arguing about, whether AI is a net benefit or detriment to development speed here. I'm just pointing out that one cannot just cherry pick which study to accept if both are stated as unreliable.
anengineerandacat@reddit
20% sounds about right, we ran a year long project with heavy AI usage and estimated the project before we looped in the AI tooling.
Across the entire project we were about 26% more effective based on velocity.
Not all of it was coding related though, planning efficiencies, research efficiencies, and automation thanks to AI.
Most of this on Kiro and Claude Sonnet 4.6; next project will try to go a bit more heavy with a sorta Jira to code agent where we plan everything into a Jira ticket and use that as the spec but there is an upfront cost to that we have to figure out how to report.
Dry_Author8849@reddit
You will find the results of the SWE bench pro (2026) leaderboard way more interesting
Read and draw your own conclusions. The pro version of the tests shows the best models with 46% accuracy. So a 54% chance of wrong code.
It's interesting they have a "live" version of the tests where the models score 81% accuracy that they deem "contaminated". Check it out. Setting up tests for LLM accuracy is not a walk in the park.
Then think about SWEs' abilities to detect those inaccurate code results.
It's an incredible tool that needs close monitoring if you work in complex environments or deal with a complexity bordering the context limits.
It's worth the read.
On the other hand the study you posted should never have been published. It's a waste of time.
Cheers!
nasanu@reddit
I still do almost nothing each day waiting for others to unblock me. Ai won't change that.
gdinProgramator@reddit
Please provide context that shows this is utter bullshit. I will do it for you. Taken from the link:
So the new study is structured around hard core veterans that use AI just to boilerplate the code. Which explains the speed up as AI becomes exponentially dumb and eventually useless for any coding reasoning tasks.
SadSongsMakeMeGlad@reddit
Collaborating with an AI agent while coding has saved me hours of time I would have normally spent researching solutions to everyday problems. For that reason alone it’s earned its place in my arsenal. I can give real-world examples if you like. It helped me immensely just a couple days ago. But this is using it as a glorified search engine, which it can excel at.
On the coding side, it allows me to work at a higher abstraction and iterate quicker. I can see the quality of my work has also improved since moving to Claude Code at the beginning of this year. I am writing more comprehensive tests and developing features to a level I would not have the time for in the past.
They are not perfect, but the benefit has been undeniable for me. Any variance in the speed of the work seems almost beside the point. I’m not really sure what they’re measuring is what counts.
The only problem I have with AI at all is that I don’t want my tools to be owned by a corporation. The future I want is owning my own LLM for coding work, just like I own a MacBook.
theguruofreason@reddit
If your code quality improved by using Claude...
Yikes.
HazelCheese@reddit
Eh it's the difference between cooking for my friends vs being a chef at a pub.
Handwritten from scratch code is more bespoke but most people are also happy with pub food.
You don't need the best code in the world unless you are working on stuff that requires near perfect performance.
SadSongsMakeMeGlad@reddit
That is very true, but not what I said. I said the quality of my work improved. My work is not the code itself, but the software product. I can admit the quality of the code is not always the best, but I refactor when necessary. I have to say though, the code quality it produces is getting better all the time.
Aggressive-Exit8195@reddit
I’d love some real world examples
- a confused mid level dev that can only use AI for personal projects since work banned it
SadSongsMakeMeGlad@reddit
We are integrating IP cameras into our app, using software that runs on a Raspberry Pi to upload mp4 clips to S3. For some reason, the clips worked everywhere, except iOS devices. And we could not figure out why.
I told Claude and it immediately identified that the tool we’re using to capture the video is likely using ffmpeg behind the scenes to transcode and clip the video stream. By default, ffmpeg tags HEVC video with “hev1”, which AVFoundation will not play. Instead, it requires the “hvc1” tag. It then provided me several ffmpeg commands I could use to deduce if that was the problem and then an example command how to re-tag them.
Now, we would have figured that out eventually, but it would likely have taken a good amount of time to put together what Claude provided in seconds.
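For anyone hitting the same hev1/hvc1 issue, here is a sketch of the kind of check and re-tag described above. These are not the commands Claude produced (the comment does not quote them); they are commonly used ffprobe/ffmpeg invocations, and the file names are placeholders. The script only builds and prints the commands, it does not run them:

```python
import shlex
from typing import List


def probe_tag_cmd(path: str) -> List[str]:
    """ffprobe command that prints the codec name and container tag of the first video stream."""
    return [
        "ffprobe", "-v", "error",
        "-select_streams", "v:0",
        "-show_entries", "stream=codec_name,codec_tag_string",
        "-of", "default=noprint_wrappers=1",
        path,
    ]


def retag_cmd(src: str, dst: str) -> List[str]:
    """ffmpeg command that copies streams without re-encoding and writes the video tag as hvc1."""
    return ["ffmpeg", "-i", src, "-c", "copy", "-tag:v", "hvc1", dst]


def needs_retag(codec_name: str, codec_tag: str) -> bool:
    """True for HEVC video tagged hev1, the combination described above."""
    return codec_name == "hevc" and codec_tag == "hev1"


# clip.mp4 and clip_ios.mp4 are placeholder file names.
print(shlex.join(probe_tag_cmd("clip.mp4")))
print(shlex.join(retag_cmd("clip.mp4", "clip_ios.mp4")))
```

Whether a stream copy with a new tag is enough for a given capture tool is an assumption worth verifying on a real clip; the alternative is to set the tag in the tool's original ffmpeg invocation.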
apricotmaniac44@reddit
I built a dynamic code loading mechanism for microcontrollers (arm cortex m0+) which is like... a worse version of ELF but works perfectly for our case (loading plugins at runtime using bluetooth). All with the help of Gemini in a couple of days. It especially helped a lot with objdump output, interpreting the machine code instructions, etc. It would probably have taken much longer if it wasn't for Gemini
Doctuh@reddit
Same, it found a gnarly bug with some subtle race timing that I couldn't find for weeks.
Then in the next session it tripped over its dick importing from the stdlib.
🤷‍♂️
raddiwallah@reddit
I had to use some internal repo to set up some containers for my testing. I'm familiar with docker, ssh, SQL but the repo had a lot of domain specific code the other team owns. I had to simply get it up and running. I literally let Claude rip on it, handle the fixes, deploy my containers and heal them when required.
I was also able to improve the health check, contribute back to the internal repo because I knew a feature was missing.
Without LLM, I’d be spending days parsing logs seeing what to fix. LLM did not magically fix it for me, I had to tell it to tail the logs, observe and fix.
That was an insane productivity boost for me.
coldblade2000@reddit
We have a complex codebase with many feature-flags, as we interface with multiple external systems, most of them legacy. It's WAY faster to just ask the AI why such-and-such weird rule is there, or where the appropriate layer for making a given change would be, instead of investigating it yourself. Those minutes you save build up, and it makes onboarding much quicker.
WhateverHowever1337@reddit
why is it banned at your job?
Scottz0rz@reddit
Usually it's security purposes for certain industries that you can't share company data with third parties, ie: government, healthcare, infosec work.
Also if the company is cheap and not paying for a real license, using unauthorized AI means your data is being shared without knowledge.
The other main reason why I see AI use being 100% banned is my friends who work in the video game industry. Gamers hate AI and will froth at the mouth when the word is mentioned, so any accidental AI use is explicitly banned at some companies just in case an AI dev-test texture leaks into the game.
There was this whole big drama about Clair Obscur: Expedition 33, which won Game of the Year last year, where people got irrationally mad about some AI stuff.
https://www.polygon.com/game-awards-expedition-33-disqualified-did-it-use-ai-response/
gefahr@reddit
The gaming industry is in crisis. It takes like a decade and hundreds of millions of dollars to ship AAA games right now. It's a couple external factors away from a huge correction that will cost a ton of jobs, unfortunately.
Anyway, my point was, once that happens.. I predict they'll all use whatever makes things faster. Catering to gamers' feelings about AI is a boom-time phenomenon only.
Scottz0rz@reddit
As both a gamer and (non-game) developer, I'm torn about it to be honest. The absolute hatred towards AI from gamers is honestly pretty justified. But, yeah you're still right.
For the gamer POV though:
AI data centers have vacuumed up most consumer hardware supply, and those supply constraints have made video game consoles and PCs more expensive. It's made both gaming and PC building inaccessible to a lot of people.
I saw the AI boom affecting PC parts and bought a new prebuilt computer on sale for $2200 from Costco last year. The same prebuilt configuration is now around $3800 less than a year later and ~$3300 if I parted out similar components to build it myself, last I checked.
The AAA industry has positioned itself around cutting-edge technology and requiring the latest hardware, but the latest technological advancements in consumer graphics are purely AI-related features with DLSS, super-sampling, frame generation, etc. On top of the bloated budgets and dev time, if consumers can't afford the new hardware and the economy turns down, yeah, AAA has been a bubble waiting to pop for a decade as well.
Beyond that, the specific aspect of generative AI for art assets is particularly unethical and takes away the "soul" of a video game, which is meant to be an artistic medium. Ultimately, core gameplay ideas and art should come from humans if games are meant to be an artistic expression.
However, not all AI use is the same.
You're still not going to convince all gamers to not hate AI since it sucked up consumer hardware supply making the hobby expensive, and I do think that generative AI art is gross, but as an internal productivity tool and not a shippable consumer-facing slop, that's the value to me.
Having an LLM coding agent help isolate and find a weird engine bug that only happens on certain AMD GPUs or on ultrawide resolutions is a valid use case of AI that helps game developers, who are notoriously overworked and burnt out. There is no artistic expression in accidentally writing a bug, unless we're talking about something like rocket jumping in Quake and Doom or Gandhi nuking you in Civilization, I suppose lol.
Anyway TL;DR - like everything, it depends. I don't like AI as a gamer, and I also still don't love AI as a developer because I recognize the technology is ripe for abuse, has major negative environmental/economic impact, and currently is barely regulated when it very much should be.
biosc1@reddit
Some of my main daily tasks are run with ai:
grab the task from Teamwork. Analyze the task, review the code base, build a plan, summarize the solution with the files that would be touched. Generate a time estimate based on a generic senior dev level. Post back to task as a private comment. Repeat for all new tasks (which I have set up to be tagged as "estimate needed"). This runs every morning or on command. Lets me roll into work and have a summary of stuff to do. Then I'll review the summary and adjust the quote. It gives me a great starting point.
for every piece of work run with Ai, it generates / keeps track of a testing plan whether it be an automated test or manual test (with checklist for me to review)
I like to use the summaries to have it do a test run of the proposed solution. Most of the time, the changes are minor and I just let it do the change.
Basically, use the tools to save you time. I don't have to search for the file where I'll need to change / update the code. The tool will find it for me and point me in the right direction.
I don't have to review new tasks because my tool does it for me and summarizes it for me. I even have it send me an email with the summary and links.
It's especially helpful for minor stuff like "change the wording here", etc. I just let those run on their own and review the PR. I have different skills created for "planning" or "just yolo it" and items in-between where I need to get my hands dirty but want assistance in finding the code and how it's used.
These tools are especially helpful in my agency environment where you are thrown new clients rather quickly and need code analysis.
ParamedicBig514@reddit
That sounds way better than having to dig through old tickets yourself.
Notary_Reddit@reddit
My team owns just over 1000 named Go tests. About 3 weeks ago the test suite ran in 360s. In about 2 hours in an afternoon with AI I generated a blueprint for how we could speed up the tests. I handed off the plan to a jr engineer and every day since he has made 3-10 tests faster: removing sleeps, reducing counts, and using better dependencies. I spent 3 more hours Friday double checking what was left. The test suite runs in 45s now. That's about 5 minutes of savings every time anyone on the team runs the test suite.
I figure he was at least 3x faster with AI than without. We should be "even" on time spent in about 2 months, without accounting for the couple of bugs he found along the way.
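The break-even reasoning above can be checked with a few lines of arithmetic. A minimal sketch: the 360 s and 45 s figures and the 5 hours of planning and review come from the comment, while the junior engineer's 45 hours is a placeholder assumption, since the thread does not give that number:

```python
def seconds_saved_per_run(before_s: float, after_s: float) -> float:
    """Wall-clock time saved on each run of the test suite."""
    return before_s - after_s


def runs_to_break_even(hours_invested: float, saved_per_run_s: float) -> float:
    """Number of test-suite runs needed before time saved equals time invested."""
    return hours_invested * 3600.0 / saved_per_run_s


saved = seconds_saved_per_run(360.0, 45.0)  # figures from the comment: 315 s per run
print(f"saved per run: {saved:.0f} s ({saved / 60:.2f} min)")

# 2 h of planning + 3 h of review are from the comment; 45 h of the junior
# engineer's time is a hypothetical placeholder.
invested_hours = 2.0 + 3.0 + 45.0
runs = runs_to_break_even(invested_hours, saved)
print(f"break-even after about {runs:.0f} runs across the team")
```

How many calendar days that takes depends on how often the team runs the suite, which is another input the comment leaves out.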
eliquy@reddit
This is my experience with Claude Code exactly. I'm actively working on 3 major projects, plus maintaining legacy projects, across multiple languages - frontend, backend, IAC, CICD, tests, documentation, AWS+Azure, Web and mobile - all of this over the last 4 years and there's no comparison: since about the start of this year with Claude Code and Opus, my productivity and quality of output are a step change beyond anything before.
The LLM does the grunt work and I can focus my time on reviewing and guiding the system, ensuring all of the non-functional requirements are covered, and communicating with stakeholders, and all of it happening at a rate that is easily 2x faster than before.
These tools are incredible when used appropriately and effectively by experienced developers
SmartCustard9944@reddit
Claude 4.6 level of intelligence is the real unlock.
SmartCustard9944@reddit
It is already kind of expensive if you want to do any effective work. GPT 5.4 and Opus 4.6 seem to be the intelligence threshold for meaningful, reliable agentic work. Looking forward to cheaper open models that will reach that level in the future. DeepSeek V4 Flash, for instance, is super super cheap, but also quite annoying to use and forgetful.
Whitchorence@reddit
There's a definite lower threshold to "fuck it, why not?" kind of stuff but I guess you could either point that at quality improvements or just tossing more poorly tested features out there.
Redalb@reddit
Running Qwen 3.6-27b locally with OpenCode is nearly the same experience as Claude Code for me. A little slower on my hardware (32gb ram, 4090) than Claude but fast enough to not be an issue. My next MacBook will likely be one with 128gb of ram. I would be able to load huge/multiple models with that much memory. You also have things like SpectralQuant that are making it easier to run these LLMs with large context windows.
SadSongsMakeMeGlad@reddit
That is great to hear. I have done some research into this, but haven’t had the money to invest yet. And they will only get better.
Scottz0rz@reddit
That's kinda what tools like Ollama and LM Studio are for: running open-source or open-weight LLMs locally on your machine.
I've not really played with the different coding agents, since I have a personal Claude Pro license I'm abusing while it's cheap and subsidized, like you said, and I want to know how Anthropic's tools work since that's what my work's enterprise license uses, so it's expedient to know how to use the same tools.
I have my old spare PC that has an RTX 3090 in it with 24 GB of VRAM, and the local model coding agents have web-search and other tool support these days, and I expose the Ollama port on my local network, so all my devices on my home internet can see it.
My work Macbook Pro has 128GB of RAM since they have the "unified memory" that shares it equally, so you can load a really beefy model to do coding tasks onto that. I'd definitely consider that a real possibility for companies in the future wanting to leverage AI for coding.
Especially when you think about it - the real use case isn't just saving money but for privacy-sensitive/compliance use cases where you can't legally share your code/data with a third-party. Healthcare, security, government work might really be able to leverage local models on company devices or ones that are hosted on-prem on company servers.
In theory, you can take an existing open-weight model and then feed it extra training data on your own codebases, knowledge bases, internal style guidelines, etc and then have that usable for employees.
... probably - I don't know much about this crap, but I'm learning because that's kinda my job to learn how whatever new stupid shit works that leadership is trying to shove down my throat lol.
DotEmbarrassed2972@reddit
"When surveyed, 30% to 50% of developers told us that they were choosing not to submit some tasks because they did not want to do them without AI."
Sounds like there's a potentially infinite speed-up for 30-50% of developers who now suck so bad that they cannot perform certain tasks without LLMs. This phenomenon was not apparent in the initial study.
I wouldn't take the findings as being all that meaningful though, as METR comes right out of the gate saying that their study is flawed.
TehBens@reddit
Not surprising. Claude Opus 4.5 was a turning point, because it was significantly better at coding. I try to avoid their other models for coding. They are sometimes okay for small Python tasks, but in general Opus outperforms by a large margin.
SchemeMaterial2877@reddit
Could be true, because AI got better. In 2025 it was still shit for coding: it basically wrote code which did not work and often invented things which didn't exist. Now it's actually pretty useful and can even make changes in a complex code base. It's not AI slop anymore.
But reading the comments on this sub, it seems like most people are in denial, probably because they have built their whole personality around coding.
Rascal2pt0@reddit
For the simple crap that would have previously been boilerplate it’s absolutely faster. It excels at the edge of the system. I find it has more issues the deeper into a system I go and where things are novel. When I have it work on complex areas, I sometimes have to just give up after a bit of prompting and roll up my sleeves, as it were.
Heavy-Report9931@reddit
all I can say is it has definitely bolstered my knowledge in areas where I am weak. Computer networks?
went from "what's a switch?" to building out a 10gb internal home network, using vlans to segment the network for iot, home, homelab etc.
the other day I implemented my own connection pool logic to understand how the hell it works and claude walked me through it.
I let AI teach me. I write code but I let it assess if the code I wrote is ass or not.
then I iterate, and if I don't know something I ask it "how is this done?". it's been such an amazing help in learning.
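For anyone curious what "connection pool logic" boils down to, here is a minimal sketch in Python. It is not the commenter's implementation: `ConnectionPool` and `factory` are hypothetical names, and it only shows the core idea of creating a fixed number of connections up front and lending them out one at a time.

```python
import queue
from contextlib import contextmanager


class ConnectionPool:
    """Fixed-size pool: connections are created once and reused."""

    def __init__(self, factory, size):
        # factory is any zero-argument callable that returns a new connection
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(factory())

    @contextmanager
    def connection(self, timeout=None):
        # Blocks until a connection is free; raises queue.Empty on timeout
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            # Always hand the connection back, even if the caller raised
            self._idle.put(conn)


# Usage with a stand-in "connection" (a plain object) instead of a real socket
pool = ConnectionPool(factory=object, size=2)
with pool.connection() as conn:
    print("borrowed", conn)
```

Real pools usually add things this sketch leaves out, such as lazy creation, health checks, and replacing broken connections.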
Ok-Shower6174@reddit
20% faster at writing code, 40% slower because we are arguing with a hallucinating LLM about a missing semicolon.
sayqm@reddit
Still irrelevant. They just asked people "are you faster with AI?"
Homelander-30@reddit
I disagree. We recently developed a networking application and our company asked us to heavily use AI to generate the code plan and write the code. Despite us providing multiple references, the output was not what we expected. The code generated by Opus had a lot of bugs, and sometimes the code would not follow the architecture we asked the LLM to follow.
It took us nearly 3 weeks to get the MVP working, and we were working around 12-14 hours a day to get things running. I do agree that it kind of saves us time writing code, but debugging and fixing the bugs took a lot of time and I felt I could've written the code myself.
davearneson@reddit
They stated that the results of their current study are unreliable not their old study. You got that arse about.
Michaeli_Starky@reddit
Our teams are 30-40% faster since adoption. The bottleneck is no longer in the writing of the code.
hypernsansa@reddit
It never was...
Michaeli_Starky@reddit
It always was.
itix@reddit
That is interesting.
hypernsansa@reddit
Skill atrophy is really something
raynorelyp@reddit
I know without question ai makes me slower. But I’m lazy and it’s fun. Anyone who thinks it’s making them faster doesn’t know time management.
hypernsansa@reddit
Exactly. Past a certain point laziness ends up being more work than just doing the work from the beginning. Programmers are infamous for this.
symbiatch@reddit
Reading the “study”… It goes on and on about how badly it was done. People couldn’t finish their work without AI (so not “experienced developers”), they refused to work without AI, they self-reported stuff on vibes sometimes, the whole cohort was 57 developers without selection for representativeness of the population…
So yeah. It means literally nothing.
And it originally went with “must be paid at least $150/h”, so that’s a huge bias also.
So I wouldn’t care at all what their studies say when they are this biased. Of course people who demand to use AI and can’t do their tasks without will be faster with AI.
Or did I miss something?
r0ck0@reddit
These studies that have shown people being slower "using AI" make me really wonder what exactly they mean by "using AI"...
Using it for what? EVERYTHING? Or just the things it's actually useful for?
Seems it could only logically be that if it's slowing them down, then it's the "EVERYTHING"... including stuff it's not useful for.
Completely depends how you use it.
And the more we try & fail with it, the more we learn what it is + isn't good at.
If it's making you slower... then just stop using it on the things it's slower/worse at.
Immediate_Rhubarb430@reddit
I always found it hard to believe that AI would make you slower in such an obvious way. If AI ends up having a negative impact, I expect it will be through accumulated damage in large code bases over long periods of time as the organization becomes unfamiliar with the core logic.
But even that seems a stretch
Healthy_Albatross_73@reddit
Add in that measuring developer productivity has been impossible for years now.
Immediate_Rhubarb430@reddit
Amen
new-runningmn9@reddit
I’ve had this conversation with folks in my world that are all in on AI. They’ve published numbers on these massive improvements, but it’s unclear how they are doing the accounting. Their current workflow showed a substantial speed up - but only so long as you didn’t include any of the time it took to learn how to implement and build the system. My reservations mostly center on the fact that talking to them about AI is like talking to Scientologists. :)
Immediate_Rhubarb430@reddit
Yeah plus software productivity is famously hard to measure. Esp when you consider the long lifetime of most software. I take metrics either way with a big grain of salt
damnburglar@reddit
The allusion to Scientology sounds apt lol.
Published numbers usually have the intent to impress the shareholders and rarely reflect reality. Productivity always has and always will be cherry-picked numbers to show the boss/world.
x-jhp-x@reddit
It depends on the task, but sometimes it can be obvious. I'm a little curious to see if it has improved, but it failed to produce code that worked for a few tasks 1-2 months ago when I tried to use it. It would make up functions that didn't exist, and when asked to write the code for the function it made up, the solutions it came up with had no hope of working. I ended up putting together a few simple working examples & submitting feedback for them to improve it though, so hopefully it has gotten better.
If you're wondering what the task was, one example is that I essentially needed a more advanced version of this: https://github.com/nasa/QuIP/blob/master/libsrc/cu2/cu2_yuv2rgb.cu I used a more complex debayer filter, did some denoising, and added a tiny AI kernel to handle parameters & optimization of those. I didn't send in the eventual version I used to them though, just a few simple examples. Anthropic/openai can pay me to do the advanced work for them if I feel like it & they feel like it hahaha.
It also didn't seem to really "understand" math beyond a high school/undergrad level, and I also couldn't teach it new concepts, or have it read a textbook & then have it apply them. It is getting better, but it is also pretty limited. With most of the engineers I work with, you can hand them a textbook, have them go through the examples & read it, and then they'll be able to apply it to their work.
I am a bit curious about the languages/tasks that were used in the study. It also looks like they were only getting 60 devs to work on this, so I'd assume there is not a lot of variety. Their study also looks limited to open source projects, and I wonder if the AI they are using had those projects in their training set. A lot of my work is library heavy (npp, mkl, openmp, tbb, etc. etc.), performance is critical (the code needs to be near optimal in most cases, or it is near useless), and a lot of the work requires a deeper understanding & multiple things to keep in mind at the same time. Honestly, the last part was probably the most frustrating. It got many things wrong repeatedly because it gets worse or forgets things the more it has to know, or has to keep in memory. I'm sure that is fine for stuff like a webapp, but if you're trying to push the hardware to operate at its max, forgetting to follow one instruction from a long list means a complete failure. Obviously, I could break down what it needed to write by saying something like, "on line 36, do ...", but at that point, it is just easier & faster to write the code myself. I still use AI for simple/easy/repetitive tasks though.
Did the study detail outliers? Or was the study just limited to tasks that AI had a chance of accomplishing? Currently, I'm not sure LLMs will ever be able to hit the same level of 'understanding' most of my work needs, and it seems like someone needs to come up with something new or different. It almost seems like if the AI were able to incorporate some sort of visuospatial component, it'd do a lot better. A lot of my understanding of math comes from this. If you're wondering what I mean, 3blue1brown does some great visualizations, and has deepened my understanding significantly.
DigThatData@reddit
whaddayaknow, people get better with tools the longer they use them. whodathunkit.
theguruofreason@reddit
They didn't like the results they got from an actually robust study, so they gamed the study to get the result they always wanted.
sp106@reddit
Axe experts with years of experience felling trees with axes would probably also be slower in their first year with chainsaws.
RabbitLogic@reddit
20% sounds about right from my estimation.
zeroconflicthere@reddit
IMO we're outsourcing code to AI and getting the productivity, but just spending less time at it.
Longjumping-Ad514@reddit
“Study”
Winter-Rip712@reddit
And the study conveniently ignores the parallelization of work that is easily done with AI.
throwaway_0x90@reddit
They can make the data say anything needed to fit their narrative.
Southern-Cattle4038@reddit
From the link:
“ Unfortunately, given participant feedback and surveys, we believe that the data from our new experiment gives us an unreliable signal of the current productivity effect of AI tools. The primary reason is that we have observed a significant increase in developers choosing not to participate in the study because they do not wish to work without AI, which likely biases downwards our estimate of AI-assisted speedup. We additionally believe there have been selection effects due to a lower pay rate (we reduced the pay from $150/hr to $50/hr), and that our measurements of time-spent on each task are unreliable for the fraction of developers who use multiple AI agents concurrently. Based on conversations with study participants, we believe it is likely that developers are more sped up from AI tools now — in early 2026 — compared to our estimates from early 2025. However, because of the selection effects in our experiment, our data is only very weak evidence for the size of this increase.
Our raw results show some evidence for speedup. Our early 2025 study found the use of AI causes tasks to take 19% longer, with a confidence interval between +2% and +39%. For the subset of the original developers who participated in the later study, we now estimate a speedup of -18% with a confidence interval between -38% and +9%. Among newly-recruited developers the estimated speedup is -4%, with a confidence interval between -15% and +9%.”
They cut the pay, couldn’t find enough people to do the new study, and guesstimated a new result that doesn’t show a statistically significant improvement.
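The "not statistically significant" reading can be checked mechanically against the quoted numbers. This Python sketch assumes the usual rule of thumb that an interval containing zero means the effect is not significant at that interval's confidence level (the quote does not state the level). The figures are copied from the quote above; the function name is made up for illustration.

```python
def interval_excludes_zero(low_pct: float, high_pct: float) -> bool:
    """True when the whole confidence interval sits on one side of zero."""
    return low_pct > 0 or high_pct < 0


# (point estimate, CI low, CI high): percent change in task time with AI,
# positive = slower, negative = faster, as quoted from the METR post
results = {
    "early 2025 study": (19, 2, 39),
    "returning developers": (-18, -38, 9),
    "newly recruited developers": (-4, -15, 9),
}

for name, (estimate, low, high) in results.items():
    verdict = "excludes zero" if interval_excludes_zero(low, high) else "includes zero"
    print(f"{name}: {estimate:+d}% [{low:+d}%, {high:+d}%] {verdict}")
```

Only the early 2025 slowdown has an interval that excludes zero; both 2026 intervals cross it.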
brainrotbro@reddit
I’m def faster than AI. But I can imagine advances in agentic models that would eventually make them faster. For now, though, I just give multiple agents all the poop work.
catfrogbigdog@reddit
METR is a bit biased in favor of the labs because leadership there is mostly ex OpenAI, DeepMind and Anthropic.
SansSariph@reddit
It's good to be aware of bias. We can take that information and use it to scrutinize study design and how the bias could influence results, as well as remain aware of it when taking their analysis of the results at face value.
There is risk in treating a biased study runner as invalidating findings.
I'm saying this only because I can imagine someone reading this comment and thinking that means the data is not interesting or worth looking at closely.
catfrogbigdog@reddit
Yes exactly. I’m not at all trying to be dismissive but highlighting that METR is biased. In particular the organization’s goals are oriented towards identifying existential risks: https://metr.org/about
This point of view is very popular on social media and in the frontier labs but there are plenty of AI researchers that speak out against this point of view. Yann LeCun (ex-Meta/FAIR now AMI) and François Chollet (Keras / ARC-AGI) to name a few.
kyoob@reddit
Man oh man you would never have me answering this kind of survey.
Whitchorence@reddit
I think people will see whatever they want in the data and choose studies that flatter their worldview, like they do with every other subject.
skdcloud@reddit
CEOs have drunk the Kool-Aid and 99% of LinkedIn posts are snake oil, but if you try to ignore the gaslighting from non-technical people, it can do some really helpful things.
I use Amazon Kiro, and am not familiar enough with other AI tools to say if it's better or worse, but I have gotten it to do some cool things.
I work at an enterprise company with legacy tech so quick AI projects in modern languages helps keep me sane.
The other day a jr dev expressed a desire to learn modern frameworks, so I picked up one of her tickets, pointed the AI at our product documentation, and got it to build a basic React foundation layer with mock data from scratch, then got it to implement her ticket on that foundation. It looked sensible and gave the jr a starting point to learn modern tech.
I also got it to document our database of 1k tables which was previously undocumented and made it draw a few ERDs and give summaries about usages of tables. This is particularly helpful to me.
We also use Salesforce (it's terrible) and I got AI to rebuild the app screen by screen in React so it's easier to spin up, generate test data for, etc. This project will never see production but is really helpful for me to navigate a sibling team's app without a Salesforce license.
Another problem it's solved is AWS documentation. Any time you use something that isn't explicitly described as supported, it's really hard to know if it's supported or not. AI is helpful for scraping a dozen public blogs and questions to correlate these edge cases. I struggled to fully understand KMS key rotation the other day, and used AI to get a clearer answer that if you rotate a KMS key for RDS in prod and never restore from a snapshot, your data will never be reencrypted. This was important to our security guy as we were documenting key rotation and weren't aware of what it actually meant. It also helped me identify that if we ever manually rotate a KMS key and delete the old key, we could lose access to any data that wasn't reencrypted. It also helped me tie in how block level encryption works with RDS and how encrypted data can be queried, something I hadn't really thought about before. Generally I'd need to speak to an AWS architect to learn this, or spend a week reading documentation and blogs and hope I'd combined the information together properly.
Another use case: using Copilot against all of my OneNote notes. I can ask it any question and it will query all the notes I've taken from all meetings. This is really helpful when I need it.
None of this changes the snake oil being sold, nor justifies firing anyone, but I find it genuinely useful.
bestjaegerpilot@reddit
this is patently false.
* per feature, AI can be slower. Ex: it adds back useEffect even though that's not best practice anymore. So you have to fix it, then it breaks something else etc. So then you can work on tooling to address future hallucinations but it's still slow... the main value add is...
* i can work on several features in parallel now
* in fact, i'm working on about 3 or 4 big ticket items that wouldn't have been possible before by myself. Ex: upgrade massive codebase to newer node, and tsgo.
* so overall, I'm working on more impactful things in parallel and because in parallel, I can deliver more things at once (so individually maybe individual features go slower but output is more features at once). This is similar to working on a larger team. Individual productivity can go down (due to additional reviewing, mentoring, etc) but team productivity can go up
* the paradox of AI i guess
SansSariph@reddit
You're using your own output (N=1) to claim a study is "patently false" when the study and its predecessor also discuss the skew of self-reported data.
That's an interesting choice!
Even if your self-report is accurate, outliers exist and so it's meaningless to say a statistical result is "false". Where's your data?
roger_ducky@reddit
The main thing with working with AI agents:
You get the same kind of “decline” you see in team leads and architects.
They see the entire field more clearly but are less sure of the exact details of the implementation.
That, I don’t think, is necessarily a bad thing, as long as they can dig in when actually necessary.
People who delegate all their thinking to others usually get managed out eventually.
fsk@reddit
A year ago is too old to be useful. People claim the AIs are orders of magnitude better now.
If their methodology is paying people to participate, that naturally leads to bias. The people who are expert AI assisted coders working in Big Tech have better things to do than participate in a survey.
aaaaargZombies@reddit
OK just from the title of the post but
Are they actually faster then?
CalmLake999@reddit
Only the bad devs are slower. The good devs who can read fast, and know how to architect and write correct rules are 30-50x faster.
damnburglar@reddit
23 YoE staff Eng and avid AI user here, this is wrong and like all productivity metrics I don’t know if we will ever get an accurate picture.
Architecture drifts are extremely common, regardless of your rules etc, and without thorough review and testing (and review of those tests) the generated code is good as a prototype/demo and that’s all.
If you stop measuring at “I wanted X and it created it for me in a day instead of a month”, then yeah, accurate. This doesn’t factor in code/security/architecture review, or really the rest of the SDLC for that matter. I just spent a month doing a pilot with a prospective client and it ended up being round the clock debugging, modification, and addressing edge cases on code I generated in maybe 4 days. This is not my first rodeo, but the amount of work required to get from A to B didn’t really decrease overall, it just shifted where the pain and time sink was.
This experience has been consistent across all of my client projects as well.
CalmLake999@reddit
I'm finding the exact opposite. If you actually invest a bit more effort in getting the AI to make demo data and test on its own you will find out you barely need to do anything.
gefahr@reddit
You're shouting into the void. Some people are using it well and some aren't. I saw orders of magnitude differences in engineer productivity before LLMs, and AI has widened that same gap. Better engineers got better, worse ones got worse.
CalmLake999@reddit
Yep. Seems so.
damnburglar@reddit
Getting the AI to build demo data is actually my number one tip every time someone asks about going from vibe coder to actual developer, so 100% aligned there.
While I suspect it to be largely domain- and complexity-dependent, I’m curious what your validation process looks like. For me the time to review and ensure architecture is consistent, test coverage is adequate, and tests themselves are actually useful, combined with client feedback and iteration (plus firefighting), is eating up the lion's share of the productivity gains.
CalmLake999@reddit
Honestly, I make sure it references clean code books, and that every feature is extremely independent in its own dirs, no cross bleed (that's the biggest curse). I use languages with clean and good training sets (rust and svelte for example). I use simple architecture that can achieve anything; the main thing is separation of concerns.
Ok-Entertainer-1414@reddit
What the fuck would being 30-50x faster even mean? Does anyone seriously believe this shit? Think about what 50 (business) days of software productivity looked like pre-LLMs. Nobody is doing that shit in one day now. It would be obvious if that was real; we wouldn't be arguing about it. Come on
CalmLake999@reddit
Yes they are. Who are you hiring lol? Today I delivered a system that would have taken 3 months in 2 days and it's even better.
Ok-Entertainer-1414@reddit
Today, on a day that is famously a business day... yeah, that happened
CalmLake999@reddit
I don't know what business software you're in but with what I work with we deliver features extremely quickly now. We had to scale up QA to test those features, but the quality is extremely high. And yes I'm working Saturday for a business cause I'm addicted. I actually have 8 projects open now all with agents working on them.
Ok-Entertainer-1414@reddit
You can't seriously expect people to believe that you've simultaneously 30x'd your productivity and are still working weekends
CalmLake999@reddit
I have 30x'd it, but now I'm working on 30x the projects 😂
Early_Rooster7579@reddit
I know anecdotally I am certainly faster. As someone with pretty bad ADHD it's definitely made a noticeable difference for me.
tiajuanat@reddit
I made a functional clang compiler backend and Rust integration for a processor designed in the 70s in like a week. I think doing that without LLM help would've taken me like a year or two.
Now the real question is can I drop that in for the application I have, which really needs more testing to make sure it's solid.
OhMyGodItsEverywhere@reddit
I think I am waiting on more data and research before drawing any strong conclusions:
Might loosely hypothesize: "People may write some code some amount faster using tools after using them for a year."
rwilcox@reddit
My bet is that places, when they claim to measure dev productivity (potentially well, potentially not), will say two things:
nomoreplsthx@reddit
This is unsurprising.
When one study on a topic is an outlier, you should usually assume it's the wrong one. No one else had produced similar results to the original study.
Mad props to them for doing their due diligence and admitting mistakes.
As for what it means, if you thought the original study was correct, you were probably smoking that delicious cherry-pick flavor hopium. 20% is much more in line with other research.
But 20% is also not a 'replace all engineers' number. Which aligns with facts on the ground - very few companies are successfully vibecoding whole applications, but efficiency gains are leading to leaner teams.
Whether this changes depends on how the tech changes. It could plateau. It could get vastly better. No one knows.
MoreRespectForQA@reddit
-18% and -4%?
Rymasq@reddit
No shit, anyone who is good with AI knows they are more effective with it.
I'm basically making minor bug fixes exclusively with AI. Saves me so much time.
AI is so useful for debugging and minor refactors too.