Ai coding tools just turned me into an exhausted babysitter

Posted by frankgetsu@reddit | ExperiencedDevs | 60 comments

spent like three hours today debugging a PR that looked syntactically perfect but had a massive logical race condition buried inside it. our juniors are basically just pasting LLM output now and praying it passes review.

Management thinks we are shipping way faster but im honestly just spending all my time acting as a human linter for a glorified autocomplete bot. Probabalistic models guessing the next token are fundamentally flawed for core business logic. it just doesnt scale without burning out the senior engineers

it does give me a little hope seeing the research side finally pivot toward formal verification. like seeing newer reasoning agents like Aleph being evaluated on strict machine-readable proofs rather than just checking if a python script runs once without crashing

until the enterprise tools move from probabilistic guessing to mathematically provable state transitions, this whole era is just generating an ocean of technical debt that we are gonna have to clean up later

[-]

psyyduck@reddit

If you put back the punctuation, Pangram says this is an AI post.

[-]

drearymoment@reddit

If you reverse the two clauses and remove the comma, Pangram says your comment was written by AI.

[-]

ButterflySammy@reddit

Except look which letters are capitalised and which are not.

They're very "code processed this text" about it.

It's not a person who does badly at following grammar rules, but someone precisely following non standard rules.

Always the first letter at the start of a sentence for example, but they capitalise other things.

[-]

Izkata@reddit

The missing fullstops... but only at the end of paragraphs.

This one is a Gen Z extension of something Millennials have done for decades: When context from the UI makes it obvious where a sentence ends, the final period is unnecessary, and including it adds a serious tone of voice to the text. For Millennials it was pretty well confined to texting and IM/chat where you'd typically only be sending one sentence per message anyway; with Gen Z it seems to have expanded to a lot more places.

There was a minor blowup on social media some years ago when people discovered studies on this and finally learned it was a thing: https://www.npr.org/2020/09/05/909969004/before-texting-your-kid-make-sure-to-double-check-your-punctuation

[-]

drearymoment@reddit

That sounds a little far-fetched in my opinion. There are plenty of LLM tells, but if we start imagining that people generate their writing with AI and then run it through some sort of tool to strip away the LLM tells and humanize it, then I think you're going to get a ton of false positives.

I mean, what are we calling out here? Inconsistent capitalization and punctuation? Subsequent uses of autocorrect? That's just how a lot of people write, especially in an off-the-cuff format like reddit.

[-]

ButterflySammy@reddit

No. VERY consistent capitalisation and punctuation, despite being wrong, with a single error where the user clicked and whatever autocorrect they're using activated once. I.e., real people who are bad at grammar aren't this level of perfect minus a specific transformation.

If they wrote the entire thing in a single editor, autocorrect would have fixed it every time they did it; as they copy-pasted it from elsewhere, only the one place where they clicked the text got fixed.

I stated two very specific rules that are very easy to apply programmatically.

If, now I've explained it to you, you don't start to see it then it is that you are bad at perception.

[-]

drearymoment@reddit

I see what you're getting at, but it strains credulity to imagine that the post was generated by an LLM and then programmatically altered to humanize it and strip it of its tells. The simplest explanation is that the user just writes in a peculiar format.

[-]

ButterflySammy@reddit

Why do this user and several other different people all have this exact trait?

I could believe it if there was one, but it's an oddly specific thing for multiple people to do.

[-]

drearymoment@reddit

I haven't noticed that pattern. Maybe you're right that I'm not as perceptive as you are.

[-]

x-jhp-x@reddit

My pangram told me that the quick brown fox jumped over the lazy dog!

[-]

annoying_cyclist@reddit

Brought to you by the content marketing in OP's profile.

[-]

Perfect-Campaign9551@reddit

Is definitely an engagement bot

[-]

PastaGoodGnocchiBad@reddit

GPTzero too, thank you for noting.

[-]

sparklikemind@reddit

Sounds like you don't need juniors anymore

[-]

AccomplishedLeave506@reddit

It's not that we don't need juniors any more, it's that the juniors never stop being juniors because they are using AI tools to do the "thinking" for them. Tools that spit out junior level quality. So now all I have is juniors. Forever. Who produce code at the speed of light. Nightmare.

We need juniors who are actually learning how to engineer software and not type a prompt followed by cut and paste. I can hire the guy who lives under the underpass to do that. He might even get bored and start learning how to do the job instead.

[-]

mile-high-guy@reddit

The AI tools will only get better over time. It won't be junior level code

[-]

AccomplishedLeave506@reddit

The only people who think that AI can do a job are people who can't do that job. It produces garbage. Shiny garbage that looks good to people who don't know what they're looking at. But garbage nonetheless.

[-]

BROTALITY@reddit

Listen, I'm not pro-AI. The code that it produces isn't garbage, but it takes effort to understand what is and isn't garbage. That's the current disconnect between the senior and junior level at this moment in time, it would seem. Juniors put in a prompt and copy/paste the results without thinking about it, and that is absolutely garbage. A senior that knows how to guide the AI into writing good code and push back against the slop can absolutely fly

[-]

AccomplishedLeave506@reddit

I'm a very senior engineer. I've tried the various AI tools. They all produce crap. I can do it quicker and better without their help. And so can the mid level engineers I work with. The ones who have started using it are slowly becoming worse engineers as they stop thinking about what they're actually doing and let the AI write junk.

I've had 3 pull requests today that were complete crap. On the surface they looked ok, but actually large parts of them made no sense. One of them made such little sense that it was completely scrapped and restarted. It did what the ticket wanted. It had unit tests that all passed. It was so badly thought out that it couldn't be used. Only people who don't know how to do the job think it can do the job.

[-]

BROTALITY@reddit

I'm not disagreeing with you. I find the people who are fully embracing AI at my current place are also producing the worst code. I just think if you use it as a tool/pair programmer type of thing and not fully replace your critical thinking, it's very useful

[-]

Izkata@reddit

The problem I've heard from two different co-workers (one directly, one indirectly from a teammate) is that it's too easy to reach for so no matter how they try to resist they drift towards that mode of work. The phrasing from the first one made it sound a bit like an addiction.

[-]

BROTALITY@reddit

It’s probably by design, considering how hard it is to put down phones/social media now. Get people hooked so they’ll pay

[-]

Beginning-Cream7813@reddit

The thing is, there are going to be far fewer seniors in the not too distant future as well (or at least that's going to be the aspiration of management). People who think they can park themselves as "senior devs" are gonna have an awful lot of competition for a dwindling number of jobs.

[-]

ButterflySammy@reddit

We call this velocity now.

Management has always done this.

They always pick one number and measure it till it destroys everything and they need to stop.

Once upon a time that number was KSLOC — thousand source lines of code — and everyone produced the same code at the same speed, but wrote a version of it that used more lines.

[-]

ElGuaco@reddit

I'll say it again. AI without a continuous long running context is worse than a junior engineer when it comes to design. Keep your AI requests short and the domain small.

Until they invent an AI that can keep the context of all the history of the app's development, they are at best a pair programmer for short jobs. If you're asking AI to design everything then good fucking luck.

If your AI code can't be easily read and understood, it's probably wrong and you're asking it to do too much. A race condition? Are you really asking it to write multi threaded apps? Jesus.

[-]

Daishiman@reddit

> Until they invent an AI that can keep the context of all the history of the app's development, they are at best a pair programmer for short jobs. If you're asking AI to design everything then good fucking luck.

You can get this to a very good approximation by generating a very well-specced plan and having each subtask be performed by subagents which are fed the proper context for each plan and with a sufficiently good test harness such that your goal is clear.

[-]

Ok_Individual_5050@reddit

That's a completely different thing than knowing the context of the project

[-]

Daishiman@reddit

Sure, if you have no idea of what you're building and you don't keep records so that even staff barely remembers what they're supposed to be building how do they expect a single-minded LLM to do that work for them?

[-]

therealslimshady1234@reddit

> Probabalistic models guessing the next token are fundamentally flawed for core business logic. it just doesnt scale without burning out the senior engineers

This, and only the mentally challenged AI bro does not understand this.

AI has just been a very complicated way of trading quality for speed. No productivity gains have been made, you are just cutting corners in an opaque way

[-]

Daishiman@reddit

Wrong. At this point if this is happening in your codebase you likely just don't know how to write good specifications and steer a model.

[-]

therealslimshady1234@reddit

> You're prompting it wrong

Oh boy 🤣

[-]

SamurottX@reddit

Every criticism of AI is met with "better prompts", "newer model", and "more markdown files"

[-]

Daishiman@reddit

I have written reams of critiques and criticized policies within and outside my company regarding AI usage to no end. I don't drink the Kool-Aid.

But there sure are a lot of people who try tools, fail to learn them, then complain when they inevitably don't work. It's endemic in the industry. Some tools have a steep learning curve and others not so.

Learning to prompt and use tools like Claude Code actually takes some skill. If you're finding it "completely useless" in run-of-the-mill tasks at this point it's really not the tool's fault.

[-]

LogicalPerformer7637@reddit

I have the same experience. If you write a good specification of what you want (a good requirement), then AI can provide good results.

[-]

x-jhp-x@reddit

Although I agree with the first part, there are some productivity gains. AI significantly speeds up some of the tasks I felt were boring, like documentation, writing tests, or if I need some simple boilerplate code.

[-]

NGTTwo@reddit

I've said this about 1000 times over the past few years, but I'm very much of the belief that, if you're treating your tests as boring boilerplate, you're fucking doing it wrong.

[-]

x-jhp-x@reddit

That is understandable.

I suppose I should add that I have mostly worked in regulated industries, where we are required to plan a lot in advance. By the time I start writing code, I already know what I'm writing, and already have a number of "plain English" things that I need to test written out. I also need traceability to requirements, so I need to demonstrate that my code meets the requirement with a test.

An example from something I did recently was having a function that writes data to disk, and I need to test what happens if the disk doesn't have enough space & return a specific error code if the disk is full. If the write fails after confirming that the disk had enough space, I need to return a specific and different error code for that as well. We do a full SFMEA (software failure mode and effects analysis) and more, so I have tons of examples of that.
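A rough Python sketch of what that disk-write case could look like, purely as an illustration. Everything named here is hypothetical: `write_record`, the `WriteStatus` codes, and the injectable `free_bytes` and `opener` parameters are assumptions made so the two failure paths can be simulated in a test, not the actual code or language used.

```python
import errno
import os
import shutil
import tempfile
from enum import IntEnum


class WriteStatus(IntEnum):
    """Hypothetical error codes; the real values are not given in the thread."""
    OK = 0
    DISK_FULL = 1
    WRITE_FAILED = 2


def _free_bytes(directory):
    # Real free-space lookup; tests swap this out to simulate a full disk.
    return shutil.disk_usage(directory).free


def write_record(path, data, free_bytes=_free_bytes, opener=open):
    """Write bytes to path and return a WriteStatus instead of raising."""
    directory = os.path.dirname(os.path.abspath(path))
    # Failure mode 1: not enough space, detected before writing anything.
    if free_bytes(directory) < len(data):
        return WriteStatus.DISK_FULL
    # Failure mode 2: the space check passed but the write itself failed.
    try:
        with opener(path, "wb") as handle:
            handle.write(data)
    except OSError:
        return WriteStatus.WRITE_FAILED
    return WriteStatus.OK


def _failing_opener(*args, **kwargs):
    # Stand-in for a device that errors out after the space check passed.
    raise OSError(errno.EIO, "simulated I/O error")


def test_disk_full_returns_disk_full_code():
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "record.bin")
        status = write_record(target, b"abc", free_bytes=lambda d: 0)
        assert status == WriteStatus.DISK_FULL
        assert not os.path.exists(target)  # nothing was written


def test_write_failure_after_space_check_returns_distinct_code():
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "record.bin")
        status = write_record(
            target, b"abc", free_bytes=lambda d: 10, opener=_failing_opener
        )
        assert status == WriteStatus.WRITE_FAILED


def test_successful_write_returns_ok():
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "record.bin")
        assert write_record(target, b"abc") == WriteStatus.OK
        with open(target, "rb") as handle:
            assert handle.read() == b"abc"
```

Each test maps to one plain-English requirement, which is roughly what the traceability part needs.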

AI is nice because writing those examples & the hundreds of others I am required to do is stupidly trivial, as is ensuring that they are correct even if I don't personally bang the keys, so it has been a time saver.

I don't use it for all unit tests, I don't blindly have it test whatever it feels like, and I have plenty of tests that don't show up in the requirements, but still confirm the functionality of the internal function I wrote to meet the requirements. I suppose the argument is that the traceability tests aren't 'real' unit tests, or that I separate the thought and coding stages, so that by the time I am typing, a lot of the test work feels like more boring & 'boilerplate' work.

[-]

CannotProveAThing@reddit

Is the documentation clear and concise and helpful or is this one of those things where management doesn't care about that as long as it gets done? I've used it as a starting point before and I felt like there were a lot of bullet points pointing out the obvious.

[-]

zabolekar@reddit

> or is this one of those things where management doesn't care about that as long as it gets done?

I wish more people would ask themselves that. I've seen tons of documentation of the kind "/// @param firstIndex The first index", written by actual humans, just so the reviewer won't ask why there is no documentation. Automating a useless task won't make it useful.
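To make the contrast concrete, here is a small Python sketch with made-up functions (`slice_window_tautological` and `slice_window` are hypothetical names, not from any real codebase). The first docstring only restates the parameter names; the second says things a caller could not guess from the signature.

```python
def slice_window_tautological(items, first_index, count):
    """Slice a window.

    :param items: The items.
    :param first_index: The first index.
    :param count: The count.
    """
    return list(items[first_index:first_index + count])


def slice_window(items, first_index, count):
    """Return up to ``count`` consecutive items starting at ``first_index``.

    :param items: Any sequence that supports slicing.
    :param first_index: Zero-based start position. Negative values are
        rejected; a value past the end yields an empty list.
    :param count: Maximum number of items to return. Fewer come back if
        the sequence ends first.
    :raises ValueError: If ``first_index`` or ``count`` is negative.
    """
    if first_index < 0 or count < 0:
        raise ValueError("first_index and count must be non-negative")
    return list(items[first_index:first_index + count])
```

Both functions slice the same way for valid input. Only the second docstring would save a reviewer a trip into the function body.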

[-]

x-jhp-x@reddit

I've had to use, and like reading, pydoc/doxygen/texinfo. AI can fairly easily write 95% of that for functions and the like, and all I have to do is fill in a bit of additional info.

If I'm using a library, or feel like I need to copy+paste a paragraph or something from another source, I've frequently had to copy it from the pdf, post it in an editor, and reformat it. The guy next to me wrote an emacs function to handle that though, so it isn't like AI is the only tool for these things. It is definitely a small quality of life & time saver though.

[-]

therealslimshady1234@reddit

I mean some boiler plate code is fine, but every time I have to read AI generated documentation I just skip it entirely.

Why? Because I know no human has carefully looked into it, so there is a 99% chance the documentation is misleading or leaves things out.

[-]

x-jhp-x@reddit

Ah, my workplaces have almost all been fairly strict about documentation. I guess I should add that I've seen tons of terrible and/or outdated documentation from humans too, so the way I use it is to just reduce how many keys I'm hitting on my keyboard to do a task, which feels like a time saver and quality of life improvement. I still have to read it, make changes, and usually add content though.

I also use vim & have plenty of coworkers using emacs, so spending a few hours scripting certain operations is seen as fairly normal, even if I'm only saving myself a couple of keystrokes and maybe seconds of overall time yearly hahaha. If I have a repetitive task that takes 4 keys to do, and I reduce it to pressing 2 keys, I'm pretty happy.

[-]

fuzzyFurryBunny@reddit

yes, this is my problem. It makes absolutely no sense in most scenarios. If you think about the resources and what is being done under the hood, it makes little sense. Code is just the instructions. Inferring these instructions from a prompt and then fiddling with it to get it to do what you want... why are we going backwards? I question any real productivity gains unless it's very simple code that would have been copied from Stack Overflow. Cause to me it's making a bigger and riskier mess, especially depending on how critical the application is.

[-]

fuzzyFurryBunny@reddit

> Probabalistic models guessing the next token are fundamentally flawed for core business logic.

yes, this feels like such an indirect and roundabout way to do things. Except for certain specific tasks, it feels like the opposite of efficiency...

[-]

GuybrushThreepwo0d@reddit

Formal verification will help diddly squat. But I'm probably telling this to a bot. God this sub has gotten terrible

[-]

Perfect-Campaign9551@reddit

Skill issue

[-]

Fuck-David-King@reddit

Using AI tools to write a post criticizing AI tools. Lovely.

[-]

bitloops__@reddit

I think the junior's job hasn't changed - they need to understand what they're shipping. The bar should be that they can explain the code change end-to-end before it goes in for review - not 'the AI wrote it and it ran locally'. That's a coaching and process problem that can be fixed today without waiting for formal verification.

What we have found is that the AI has no context of your actual system invariants, the architectural decisions you've made, what's deliberately constrained. It guesses locally and misses system-level things.

And actually, this is exactly the problem we've been building Bitloops (open source) to solve - https://github.com/bitloops/bitloops. It's about persistent memory for your architecture decisions and constraints so the agent isn't operating from a blank slate every time.

This doesn't solve everything, and I completely agree we need to move towards mathematically provable state transitions, which is the ultimate goal we have - build deterministic verification against the constraints and guardrails identified in each repo / codebase.

[-]

WhateverHowever1337@reddit

This is a hidden ad for that Aleph ai, pass

[-]

nameless_food@reddit

It's like using a calculator that's correct 80% of the time, and wrong 20% of the time to build a large complex machine.

That's on top of the need to specify exactly what problem you're trying to solve, as well as the solution for that problem. The description of the problem and solution need to be precise enough, and no amount of AI is going to make product owners be more precise in this area.
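A quick back-of-the-envelope calculation for the calculator analogy, as a Python sketch. It rests on a simplifying assumption that is not in the comment: every step is independently correct 80% of the time and nothing catches errors in between. The function name is made up for illustration.

```python
def chance_all_steps_correct(per_step_accuracy, steps):
    """Probability that every one of `steps` independent steps is correct."""
    return per_step_accuracy ** steps


# How quickly an 80%-reliable step compounds across a longer chain.
for steps in (1, 5, 10, 20):
    probability = chance_all_steps_correct(0.8, steps)
    print(f"{steps:>2} steps: {probability:.1%}")
```

Under that assumption, 10 chained steps all come out right only about 10.7% of the time, and 20 steps about 1.2% of the time. Real workflows have review and tests between steps, so treat this as an upper bound on how bad it gets without them.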

[-]

pja@reddit

OP is a bot & this is engagement bait. Search "frankgetsu site:reddit.com" and look at all their other posts. They post all over the place & the posts contradict each other.

[-]

BROTALITY@reddit

I feel like I'm just becoming an AI babysitter, except the AI is also that annoying nitpicker that blocks your PRs from ever crossing the finish line. My output as a developer is faster than ever, but the coding was the fun part. Now all I'm left with is the red tape and annoying BS that I didn't enjoy doing in the first place and it's killing my enjoyment of the field

[-]

Additional_Rub_7355@reddit

The thing is AI reliance overall isn't going away, simply because quality is not a priority in this industry.

You wanna do proper programming with high standards? That would be your own projects, outside of "working" hours.

[-]

Clyde_Frag@reddit

Shouldn’t the junior be reviewing their own code first? And if they’re not able to catch issues like this one, they need to be coached up.

[-]

ButterflySammy@reddit

If management doesn't know, it's your job to inform them.

I've never seen so many nerds back down from a factual argument where they have the evidence before.

[-]

account22222221@reddit

Don’t fix it for them. Fail the review and let the person know they need to fix it and are welcome to set up time to ask questions.

[-]

WildWinkWeb@reddit

Time to add “AI wrangler” to my resume.

[-]

lfelippeoz@reddit

If they are pasting LLM outputs, that's not the right way to do it.

Also, there's a lot you can do on your end to create verifiable loops.

I suggest you get your own hands dirty with some AI tools like OpenCode, Claude Code, Cursor, Kiro and develop better workflows around those tools.

Better models will not bring coherence without appropriate control layers around them

[-]

Daishiman@reddit

Yeah, while complaints about AI usage are extremely valid for a lot of organizations, if you're copy-pasting code then they're also lacking the bare minimum to work with AI.

[-]

AnnoyedVelociraptor@reddit

Formal validation still requires you to enumerate what you need to be present.

And only about 10% of the requirements are captured in the ticket.