speaking out against AI fearmongering
Posted by dodgerblue-005A9C@reddit | ExperiencedDevs | View on Reddit | 294 comments
Hi guys, I would like to share some thoughts / rant:
- ai is a minuscule reason for layoffs. the real reasons are the 2017 tax code change (the Section 174 R&D amortization rules) and the high interest rate environment. ai just makes for a good excuse, similar to RTO mandates, to force people out.
- all this "ai choosing to not shut itself down", and using terms like "reasoning", "thinking", and "hallucination", is an attempt to hype it up. fundamentally, if your product is good, you don't have to push the narrative so hard! does anyone not see the bias? they have a vested interest, they're not psychologists, and they don't have any background in neuroscience (at least i think)
- improvements have plateaued, and the increase in reported hallucinations is suspected to be ai slop feeding ai. they've started employing engineers (we have a ton of them unemployed) literally to create data for ai to feed on. one of those companies is Turing
- personally, i use these tools for research / web search and for confirming that the concepts i've understood are in line, and yet i spend so much time vetting the references and sources.
- code prediction is most accurate on a line-by-line basis. sure, it saves some typing, but if you can touch type, does it save a lot? you can't move it higher up the value chain unless you've encountered a problem that's already been solved, because there's fundamentally no logic involved.
- as an experienced professional, i spend most of my time defining the problem, anticipating edge cases and gaps from the product and design teams, getting those resolved, breaking down the problem, architecting, choosing design patterns, translating constraints into unit tests, implementing, deploying, testing, closing the feedback loop, and monitoring. fundamentally, "code completion" is effective in very few of these (implementing, maybe test cases as well?, understanding debug messages?)
bottom line, i spend more time vetting than actually building. i could be using the tool wrong, but if most of us (assuming) are facing this problem, we have to acknowledge the tool is crap
what i feel, sticking to just our community again: we're somehow more scared of acknowledging this and calling it out publicly (me included). we don't want to appear averse to change, like a forever hater, or legacy or deprecated in some way.
every argument sounds like "yeah it's shit, but it's good for something"? really, can't we just say no? are we collectively that scared of this image?
i got rejected in an interview, partly for not using ai enough. i'm glad i didn't join that company. cleaning up ai slop isn't fun!
i understand we have to weather this storm; it would be nice to see more honesty around it. or maybe i'm the doomer, and i'm fine with that. thank you for your time!!!
officerblues@reddit
I work in AI, training these kinds of models. I don't use them in my setup anymore; long-term, it's just as fast to actually learn a stack and use it yourself. I agree with everything that's been said here. Whenever I tell people I think AI coding is shit, they look at me like I'm grandpa. I used to fear it, but now I'm really happy about what I see, because I can literally be a 10x programmer and just do actual good coding when no one else is interested in it. The future is bright.
FortuneIIIPick@reddit
> I work in AI
> The future is bright.
Doughnut maker likes dough, news at 11.
officerblues@reddit
Lol, that's what you got from the message? Well, I guess I did write some shitty text.
ALoadOfThisGuy@reddit
The future is bright for us geezers who learned and grew in an LLM-less environment. I’m frightened for the next generation that will be conditioned to shortcut and slap together and pray. I’ve already got coworkers who feed their entire day into an LLM and are incapable of producing a single, unique thought.
It would also be nice if we stop thinking about AI as singularly LLMs, but I’m going to be dealing with that one for a while.
PoopsCodeAllTheTime@reddit
So.... When is the rain of dolla bills gonna come for people like us? I'm dehydrating very fast and I still don't see around the corner of the job market
another_account_327@reddit
I'd say there have always been programmers who just copy existing code from StackOverflow or wherever without understanding it. Works until you encounter an error. I don't think it's too different from using an LLM.
creaturefeature16@reddit
I want to think it's the same, but I really do think LLMs change the game for newcomers.
I think it's different in the sense that you can create highly contextually relevant code, within your own codebase, that will likely run/compile/execute, completely circumventing the deep knowledge you'd otherwise gain from piecing things together until you realize you're learning about data structures before you know it.
hkric41six@reddit
💯
SomeEffective8139@reddit
I think LLMs will be huge, but more as an automation tool than a magical power that lets you replace hundreds of workers with a single AI. With models talking to models over MCP, it becomes possible to use a prompt as an API call, say, "update the inventory system to mark package ABC123 as delivered", and the model knows what to do under the hood.
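To make that concrete, here's a rough sketch of the idea (all names hypothetical, not any particular MCP SDK): the model's only job is to turn the prompt into a structured tool call, and ordinary deterministic glue code does the rest.

    using System;
    using System.Collections.Generic;

    // Hypothetical shapes: the model reads the prompt
    // "update the inventory system to mark package ABC123 as delivered"
    // and emits a structured tool call; plain glue code handles the rest.
    record ToolCall(string Tool, Dictionary<string, string> Args);

    class InventoryClient
    {
        // Stand-in for the real inventory API; in practice this wraps an HTTP call.
        public void MarkDelivered(string packageId) =>
            Console.WriteLine($"Package {packageId} marked as delivered.");
    }

    class Dispatcher
    {
        readonly InventoryClient _inventory = new();

        // The only deterministic part we write: map the model's tool call to real code.
        public void Handle(ToolCall call)
        {
            switch (call.Tool)
            {
                case "mark_delivered":
                    _inventory.MarkDelivered(call.Args["package_id"]);
                    break;
                default:
                    throw new InvalidOperationException($"Unknown tool: {call.Tool}");
            }
        }
    }

    class Demo
    {
        static void Main()
        {
            // What the model might emit for the prompt above.
            var call = new ToolCall("mark_delivered",
                new Dictionary<string, string> { ["package_id"] = "ABC123" });
            new Dispatcher().Handle(call);
        }
    }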
I don't see this replacing developers but rather making new work for developers to support the automation. I liken it to the "big data" boom 10-15 years ago.
I agree with you that the layoffs are not really due to AI. AI is a good excuse because it makes it sound like the company is investing in growth and new technology rather than the reality that they just cannot borrow more money to keep their terrible investments afloat.
FortuneIIIPick@reddit
> an automation tool
AI isn't a tool; it is nondeterministic. You're free to ask any AI you want, and it will tell you it's not 100% deterministic. It can be helpful or a huge time waster, but it isn't a tool.
stevefuzz@reddit
Agreed. The actionability of LLMs is the real product.
SomeEffective8139@reddit
Yeah I think we're seeing that in the startup space now. Many companies started out in the last year or two with a ChatGPT-like product where you login and have a big text prompt. Now I'm seeing more stuff that is built with "AI" where the system in the background is a combination of an LLM, classic ML with a specially trained model, analytics, and automation. The user thinks it's "AI" but they don't actually write prompts. "AI" to a normal user just means "fancy automation." I think we are finding that prompt engineering is a lot harder than it sounds and users don't actually want to have to type out a prompt.
BlazeBigBang@reddit
How is that different from filling a textbox with "ABC123" and clicking on "Delivered"? Is it just the voice recognition to API call?
SomeEffective8139@reddit
It's not voice recognition at all. It's a prompt. An LLM takes text prompts, not speech.
The difference is that nobody had to code up a special endpoint to process the API payload. So whenever you want to do some new thing with the delivery system, you don't have to look up API endpoint docs and write a bunch of new integrations; you just describe it in the prompt and the model figures out what to do. MCP lets my model talk to your model, without either of us having to read API documentation or write code.
The end user might not even realize there is an LLM involved in whatever action they're taking. The prompt is probably just a string that our system builds and feeds into the model.
Kaimito1@reddit
I just ignore the fear mongering at this point. It's a constant thing on LinkedIn.
9/10 times if someone is pushing AI, they're a vibe coder or have a "no code coder consultant" kind of business
The 1/10 times it's an actually good dev saying "yeah ai won't replace us but it's worth using it as a tool to make you better. Just be careful as it can stop your skill growth"
dodgerblue-005A9C@reddit (OP)
i'm questioning the 10th guy as well; it's an opaque post on social media. we have to take them at their word, with no way to question the fundamentals.
we're fundamentally critical thinkers, or at least are supposed to be. not providing any reproducible evidence doesn't help their case
PoopsCodeAllTheTime@reddit
I have yet to see a single screen cast where LLM code looks truly productive 🥱
It's funny how there are all these people claiming that LLM makes them oh so awesome, but no one can look at this process? Literally just screen record for 30 minutes if it's so awesome! 😂 They would get a huge audience in no time too! Obviously they don't show us 🙄🥱
riotshieldready@reddit
I use LLMs to do my easy work. If I need a new endpoint that isn't too complex and we already have some of the pieces we need, I will tell the LLM to write it for me. It saves me 20 mins of doing it myself.
Then when I need to call the API on my client side and display it in a simple UI, I will upload an image of the UI to my LLM and give it clear instructions on what to do and where. It will mostly get it correct, then I'll give it a few more prompts. Saves me another 30min-1h, mostly messing around with tailwind.
Then finally I’ll ask the AI to write some tests.
It doesn’t do anything I wouldn’t do myself. I will have to edit some of the code but as a whole a task that would take me half a day can take me 20mins.
However last week I was doing some major changes to our RBAC and I wanted to give the LLM a chance to see how it would do it. It couldn’t do a single thing. None of the code any of the LLMs gave me was remotely close, or even did anything. It didn’t even seem to fully know what a JWT is or how it works.
Tl;dr if you know what you're doing and the task is pretty straightforward, you can be very productive. If the task is more complex or requires understanding your unique setup, it sucks.
PoopsCodeAllTheTime@reddit
Yeah exactly, well for me it doesn't even save me that much time. The LLM is good at writing HTML tables because that's just a mindless task, but it's not that large of a difference because I still have to go over all its code, and I type fast anyway, so... it's ok to rest my fingers for 10 minutes, but the difference doesn't feel significant. And as you say, asking it to do the wrong task ends up being a net negative, because it takes too long to review and then decide whether you'd rather correct it or undo its entire change.
I feel faster when I already have a similar file and I just copy paste it lol. The LLM feels ok when I need a generic task that I'm sure it has seen many times in its training data. But that's it: it just lets me be a little lazy with my fingers, it doesn't feel like a significant time save, and the good ends up counterbalanced by the wrong.
marx-was-right-@reddit
I watched an AI demo bomb at a live tech conference. Was fucking hilarious. The dude was scrambling to try and act like it doesn't normally do this.
congramist@reddit
I’m the 10th guy. I genuinely don’t think I could convince you or any of your crowd regardless of what I say, so I typically don’t bother, precisely because of arguments like the one you just made (imo, and it's just my opinion).
That said, I don’t need reproducible evidence to convince myself of a tool’s worth. I can tell intuitively that using a chainsaw is much easier and makes me more efficient than using an axe without collecting or analyzing a single datapoint.
I can also agree that part of the responsibility involved with using a chainsaw is that I need to pay much closer attention to its operation to avoid cutting my toes off. A chainsaw costs more. A chainsaw requires fuel, lubricant, chain sharpening, and much more maintenance. A chainsaw requires that you learn how to operate it.
My choice to use one or not is personal, and if you like the exercise then hold on to that axe, but the idea that you need some sort of reproducible evidence from someone else to convince you of the worth of a huge tech advancement is a bit odd to me.
I could be wrong, but given that this rant came after being rejected, and seeing your comments throughout the thread, I'm guessing this, like many of the “AI is useless trash” comments, is emotionally driven.
wwww4all@reddit
The question is: are you actually using a tool and learning how to use the chainsaw, when the AI gives you a finished product?
wvenable@reddit
If you need to do something, google for it, read Stack Overflow, find a good answer, and use that, did you learn anything? How did you know it was a good answer?
wwww4all@reddit
The process of searching, reading code examples, and digesting code discussion points from experts and sources is the discovery and exploration part of the learning curve.
The AI hype is about generating code at a faster pace for given prompts and context, basically shortcutting the curve. If you don't know the FE tech stack or Rust, what are you learning when looking at generated code that seems good enough to stamp LGTM and move on?
wvenable@reddit
Have you used Google in the last decade? Most of the discovery is wading through SEO crap, people asking the same question as you without an answer, "closed as duplicate", etc.
When the AI does something I don't understand (or don't trust) then I ask. Sometimes I even ask for sources so I can read the actual thing.
I actually find AI useful because I'm curious about things not because I'm looking for a quick fix.
congramist@reddit
And the answer is yes. Have you never looked for code you didn’t write on the internet? Have you ever asked a coworker for advice on how to solve a problem that you yourself didn’t conceive?
wwww4all@reddit
The classic adage applies. Give a man a fish, he eats for a day. Teach a man to fish, he eats for a lifetime.
Reading code and discussing code with coworkers are part of the learning curve, where you have to dig deeper to find context and practice applying the logic. Because at some point, you have to become independent.
The whole point of AI hype is to shortcut the curve, so that you can get code faster. Being given the fish, not necessarily being taught to fish.
Coding is like any other skill, use it or lose it.
congramist@reddit
Any tool is that way. AI is no different. You can use it as a means to learn and understand or you can use it to take shortcuts.
It is no different than using a calculator in a math class. You get to the bottom of things faster but you still need to ask yourself: does this align with what I know about the problem at hand?
ZorbaTHut@reddit
Yeah, same here.
At some point this comes down to "don't interrupt your enemy when they're making a mistake, especially when they're going to yell at you for it". I like other programmers, I hope they're successful, I want everyone to use the best tools . . . but if I have to weather verbal abuse in order to convince people to try out tools that I've found valuable, well, why am I going through all of that just to force people to try to be more productive? I'll just keep the productivity boosts for myself then, fine.
And I have a lot of friends who have made the same or similar decisions.
congramist@reddit
Precisely. Some days I wake up feeling froggy on a Friday though 😆
I think part of the issue is exhaustion. We were already expected to keep up with such a fast changing scene, while also being biz analysts, project managers, IT help desk, etc etc. The fatigue is real. But it would be nice if we could acknowledge these types of things instead of just waving away AI entirely just because it cannot fully replace your job (and it definitely cannot, let’s be clear)
But that’s kinda the cool thing about these tools; I can iterate through a lot of the bullshit now to actually focus on the types of problems that lured me into the career in the first place. I think it’s overhyped, sure, but to deny the utility is nuts to me.
ZorbaTHut@reddit
"Alright, first unit test is implemented. Now I just need to do . . . twenty-six more, all very similar but slightly different in important ways.
. . . Hey, Claude, ol' buddy ol' pal! How ya doin'! I've got some work for you."
congramist@reddit
“Ugh but then I have to read the test it wrote me and make sure it is right” 🙄
Cracks me up man
marx-was-right-@reddit
I'm not sure I'm following. This point is extremely key to why AI isn't a productivity boost: if you have to meticulously comb through unit tests before even getting to the PR phase, it would be more efficient to just write them myself with basic templating and IDE tools.
ZorbaTHut@reddit
The thing is that reading code should be faster than writing code, especially if the code is reasonably coherent and the functions it calls do sensible things.
A while back I was on a project that had a bunch of 3d geometry classes, Vector2, Rect2, Vector3, and Aabb. They also had a bunch of integer versions of them, Vector2I, Rect2I, and Vector3I. Notice anything missing? I sure did: I needed AabbI and was immediately irritated that it didn't exist.
The basics are easy, but there's dozens of little convenient utility functions that I wanted, and I did not want to write all of those by hand.
So I pasted the source code for all of those into Claude and told it to write me AabbI (except in C#, so it would live in userspace).
The end result was something like a thousand lines of incredibly dull code. It took me something like 15 minutes to read through it and fix a few minor mistakes. It would have taken me at least two hours to write it, though, and I probably would have made mistakes as well, which I would have been relatively blind to because I wrote them in the first place.
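For a sense of the shape, it was roughly this (a trimmed, illustrative sketch; the real generated code was far longer and not exactly this):

    using System;

    // Minimal stand-in; the real project already had Vector3I and friends.
    public struct Vector3I
    {
        public int X, Y, Z;
        public Vector3I(int x, int y, int z) { X = x; Y = y; Z = z; }
    }

    // Integer axis-aligned bounding box, mirroring the float Aabb's API.
    public struct AabbI
    {
        public Vector3I Position; // minimum corner
        public Vector3I Size;     // extent along each axis

        public AabbI(Vector3I position, Vector3I size)
        {
            Position = position;
            Size = size;
        }

        public Vector3I End => new Vector3I(
            Position.X + Size.X, Position.Y + Size.Y, Position.Z + Size.Z);

        public bool HasPoint(Vector3I p) =>
            p.X >= Position.X && p.X < End.X &&
            p.Y >= Position.Y && p.Y < End.Y &&
            p.Z >= Position.Z && p.Z < End.Z;

        public bool Intersects(AabbI other) =>
            Position.X < other.End.X && End.X > other.Position.X &&
            Position.Y < other.End.Y && End.Y > other.Position.Y &&
            Position.Z < other.End.Z && End.Z > other.Position.Z;

        // ...plus dozens more small utilities (Merge, Expand, GetVolume, ...),
        // which is where the thousand lines came from.
    }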
Having someone able to just throw a mostly-working first draft at you is often a huge net win, even with having to read the code and fix a few problems.
marx-was-right-@reddit
Most business problems do not require thousands of LOC. And time spent typing code out is not a very significant time sink for high-level SWEs. Most time is spent designing, troubleshooting, deploying, looking for edge cases, etc. Hands-on-keys typing code out is probably the easiest part of the job, and the time you spend correcting the LLM's mistakes easily exceeds any productivity "gained" from generating code.
ZorbaTHut@reddit
Sure, I absolutely agree. But it's nice when there's an AI nearby to make it even faster.
And I've had success using AI to help with debugging; I actually found two bugs once by just copypasting a function in and saying "I think there's a bug in this, can you find it".
There are many people in this thread disagreeing with you. Your statement is absolutely not universally correct.
marx-was-right-@reddit
I count 2 out of hundreds. You and that congramist guy
ZorbaTHut@reddit
Who cares? Time saved is time saved. We've all dealt with simple bugs, and it doesn't have to be good at everything to be useful.
In the case I generally think of, it was one of those annoying typos where there were two similar variables and I'd used the wrong one. I looked at the function, didn't see anything obviously wrong, and was about to go to bed, so I tossed it at GPT as kind of a hail-mary.
GPT found the variable typo and also a math mistake I'd made that I would have run into next.
(Amusingly, it called the math mistake I hadn't even realized was there "the bug", and then said "oh, and I found another bug!" and told me how to fix the one I was aware of.)
Your edit history:
So, uh, let me know when you're done counting I guess.
marx-was-right-@reddit
I finished.
ZorbaTHut@reddit
One two three four five.
Would you like to try again?
I could ask an AI to help if you'd like.
Show me a programmer who never made any dumb mistakes, I'll show you a liar.
marx-was-right-@reddit
Sure thing. Make sure to review the output and check your work!
Dude, you used a literal typo in a helper function as an example of an LLM "debugging". An IDE will literally underline your typo in red if it doesn't have a reference, and you can correct it before the LLM chat prompt in your web browser or IDE even loads.
How about using it for something, idk, that isn't an intern-level task before extolling the virtue? Cuz that's what's being sold to the C suite, and exactly what the OP was originally about: "Experienced" devs not pushing back and even following along with insane claims.
ZorbaTHut@reddit
Based on this Reddit conversation, here are the people who are positive about AI or actively use it:
Actively Positive AI Users:
The conversation shows a clear divide between experienced developers who have found practical value in AI tools and those who remain skeptical about productivity claims or have had poor experiences with the technology.
Amusingly, it got "potatolicious"'s name wrong. All the rest matches up though.
Sure was faster than going through on my own.
Oh man, I'm looking forward to this one.
What makes you think it would underline the typo? Because it didn't, and I'm curious how much of a hole you're going to dig for yourself here, and whether you're going to dig it before or after you go back and actually understand the original comment.
I have said multiple times in this thread that you should treat AI like a very eager novice. I have also said that this is very useful.
marx-was-right-@reddit
Damn, tbh i expected more people. Kinda disappointed!
ZorbaTHut@reddit
Looking through the conversation, here are the people who are negative about AI or express significant skepticism: Negative/Skeptical About AI:
Summary:
Positive/Active Users: 9 people
Negative/Skeptical: 10 people
The conversation shows a fairly even split, with slightly more people expressing skepticism or negative experiences than those who are enthusiastic about AI tools.
So, roughly 50/50.
(I didn't bother checking this one, though.)
congramist@reddit
… you are just being disingenuous if you are claiming that the preexisting IDE tools could write tests as quickly as an LLM
marx-was-right-@reddit
It's much easier for my IDE to generate a template and for me to fill in the guts than for an LLM to generate a mountain of completed tests that have a coin-flip chance of being incorrect in extremely inconspicuous and hard-to-trace ways.
congramist@reddit
🙄
potatolicious@reddit
Yep, just like if I handed it to an intern. Still far faster for me to review the resulting code (especially because I know the critical bits that it needs to get right) than to write it by hand.
Yeah, I get it, code review is everyone's least favorite part of the job, and this stuff will push us towards doing a lot more of it.
Ah, well. C'est la vie.
AchillesDev@reddit
My least favorite is meetings, code review is just fine by me. I only ever was annoyed by it because it took away code writing (not problem solving) time, but now I don't need to dedicate nearly as much time to that.
ZorbaTHut@reddit
oh no
reading code
how terrible
SomeEffective8139@reddit
I'm currently seeing this at my workplace. Management is pushing AI tooling for productivity, but there is a very vocal obstructionist group of developers who have a moral objection to this and refuse to use it for anything.
I am pretty cautious when it comes to jumping on a new trend. I'm not a 22-year-old vibe-coding "entrepreneur." I'm a forty-year-old with 15+ years of experience. Copilot is just fancy autocomplete. It helped me write 100+ unit tests for a feature I worked on this past month. I could have done that by hand, but I would not have had time, and I would probably have skipped a bunch of scenarios out of necessity, so as far as I'm concerned the Copilot tool helped me improve my code quality, and that is all I care about.
marx-was-right-@reddit
IDEs and basic templating have been able to do this for over a decade. You aren't breaking new ground.
SomeEffective8139@reddit
Not nearly as well as Claude does. Claude is great for completions. When I start typing a test case like "when I click the Foo button" it instantly lets me tab-complete the entire test scenario, with 95% accuracy. Nothing like that ever existed before. IDEs did some kind of templating bullshit that barely worked.
marx-was-right-@reddit
IntelliJ definitely gives you everything except the inputs, which you could copy and paste. It also doesn't "hallucinate". That 95% accuracy number you gave is BS.
SomeEffective8139@reddit
I very much doubt you have used either of these tools extensively if you think they are even in the same ballpark.
marx-was-right-@reddit
I most certainly have. I think the flip side is that you don't seem to be doing anything remotely complex, context-heavy, at scale, or with guardrails, as that's where "AI" has fallen flat on its face for me every time.
ZorbaTHut@reddit
It's fancy autocomplete, sure, but it's really fancy autocomplete!
Qwertycrackers@reddit
I keep trying to get this fancy autocomplete to pay off, and it just doesn't get there. My most recent foray was like this: I wanted to get Copilot to generate some tests that were going to be tedious to write, primarily because I wanted to use extensive mocks, which I normally avoid.
The generated result was really impressive, and at first I thought this was AI tooling turning the corner.
But then I continued and learned that copilot had made basically every mistake possible in those few hundred generated lines. By the time I had finished I had touched nearly all of them, and some of the mistakes were really sneaky and pernicious mistakes that no one would reasonably make when writing a test. Things like a test that elaborately ends up testing a tautology rather than the code under test.
Overall every attempt I make leaves me distinctly unimpressed. To be really useful to me it needs to at least sometimes write something that works, and I have yet to receive this result despite many attempts.
TheNewOP@reddit
Tried to get Copilot to update a sample response in a Markdown file for an API endpoint contract and it immediately hallucinated on me. If I can't even automate the most basic shit with it, what's the god damn point?
thephotoman@reddit
I had a similar incident where I was looking for
set splitbelow
to add to my .vimrc (a line that didn't get added to source when I last committed my .vimrc to a personal repo). Instead, Copilot spat out a bunch of NeoVim scripting instructions, presuming that I meant NeoVim when I explicitly said vim. I spent a good 10 minutes attempting to get it to not give me NeoVim-specific instructions. It never complied, so I gave up.
false_tautology@reddit
This reminds me of the time I was trying to generate something for .NET Framework and it kept giving me a mix of Framework + .NET 8, and I couldn't get it to stop using 8.
fullouterjoin@reddit
It is an anecdote unless you find a way to explain what you did to someone skilled in AI-assisted programming. One still has to be able to write detailed expectations for the result you want. On that second pass where you are looking at the tests, you should annotate every mistake it made and then either (a) have it do a second pass and fix it, or (b) start a new context with the better instructions/examples and see how it performs. The prompts you write for this are reusable, and they now form documentation.
The tone I get from your comment is that you are still trying to "take down" the AI.
ghost_jamm@reddit
Given the time necessary to go through all of the AI’s code with a fine toothed comb, annotate it, ask it to redo the work, then double check that work, this does not strike me as a massive productivity boost.
ZorbaTHut@reddit
Nothing requires the AI to redo it. Sometimes it's faster to just make the fixes yourself.
AchillesDev@reddit
Have you actually done it? Even with revisions it's a huge time saver for most tasks.
marx-was-right-@reddit
At that point its easier and faster to do it myself.
ZorbaTHut@reddit
Out of curiosity, do you know which model you were using? And was this with Github Copilot, or with something else?
The last time I needed tests I said
Then it did it all wrong, and I sighed, reverted it, and said
and it got them (almost) all right.
I do think there's some level of "understand how to talk to the AI", but I'm also curious just, y'know, what went wrong.
Qwertycrackers@reddit
Yeah I linked up github copilot with their vim plugin, since my company was pushing it at the time. I actually didn't have a good example of this type of test, which is why I was interested in getting an AI to generate something to start from. So the model is whatever github copilot defaulted to a few months ago.
I probably could have tormented the AI into doing what I wanted. But honestly I don't know why I would spend my time on that -- it did manage to generate a very flawed structure of what I asked for, so I guess it kinda saved me some time finishing the task.
ZorbaTHut@reddit
I honestly am not sure how good old GitHub Copilot is at this sort of thing; when I did it, I was using either Claude or Claude Code. I know GitHub Copilot is working on agent integration (in fact I've been using it for the first time literally today), but it seems not great, though maybe I just haven't figured out what it wants from me yet.
Also it's possible the vim plugin wasn't all that battlehardened :V
Anyway, if you end up trying it again, I recommend Claude Code if you want interactivity, or try it out in something more officially supported, or just copypaste stuff into GPT or Claude. One way or another it's always improving.
Qwertycrackers@reddit
Yeah I will probably keep poking at it different ways every once in a while, so I'll give your suggestion a try. I just think the marketing claims are pretty far out over their skis on this one.
alpacaMyToothbrush@reddit
As much hate as 'prompt engineering' gets, I feel like those who have extensively played with smaller, worse, local models are much more effective in getting what they want out of bigger models.
TLDR: You gotta give it context and examples
AchillesDev@reddit
There's your problem
SomeEffective8139@reddit
It is but you have to take it with a grain of salt and I usually end up rewriting the code the AI produces. Asking it questions, the model is also wrong a good portion of the time. A comparison I've heard is that the AI is like having a pretty clever intern who can do the grunt work for tasks you've already fully defined but if you give them unclear instructions they will go do something crazy.
ZorbaTHut@reddit
Yeah, this is the analogy I use too. AI is an uncomplaining inhumanly-fast overly-ambitious novice programmer who has read every webpage on the planet and kinda-sorta remembers most of them.
There are a lot of useful things you can use that for.
Not everything. But a lot.
SomeEffective8139@reddit
That's a perfect description.
I also liken it to Wesley Crusher from Star Trek. The know-it-all overachiever who works really fast and gets things done but often goes too far and lacks good judgement and is overconfident in areas where there is complexity.
ZorbaTHut@reddit
Hah, yeah, that's pretty accurate.
There's a lot of jobs you can safely give to Wesley. There's also jobs that you want to keep him far away from.
SomeEffective8139@reddit
Do NOT ask Wesley to migrate the production database
ZorbaTHut@reddit
You can ask Wesley how to migrate the production database, but if you do so, be extra careful to ensure that Wesley does not actually have access to the production database.
potatolicious@reddit
Yep. Even just "can autocomplete a small block of code in-context" is a game-changer IMO. Like, something as simple as "oh you're flattening a dictionary into an array, let me autocomplete that for you with correct variable names and all that", while feeling small, has a huge impact!
Individually each time there's a successful multi-line autocomplete it saves me a few seconds... but multiply that over a day, a week, a month, and the impact is very sizable!
ZorbaTHut@reddit
More than a few times I've needed some reasonably basic utility function, written the function signature for it, waited after the
{
for a few seconds, and had it spit out the entire thing.
potatolicious@reddit
Yeah. The fact that the autocomplete isn't always useful seems like... not a problem? The status quo ante is I have to write all of it manually anyway.
And yeah, lots of simple idioms are much easier with a LLM. Sometimes just typing in a comment is enough.
// Group list of entries by device identifier.
and it spits out a simple chunk of code that does exactly that (roughly like the sketch below). And yep, simple functions too tend to be very good just from presenting an interface. None of these things individually are universe-changing or anything, but in aggregate it makes me significantly more productive.
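For instance, the kind of chunk that comment tends to complete into looks roughly like this (sketched in C# with hypothetical Entry/DeviceId names, not verbatim output):

    using System.Collections.Generic;
    using System.Linq;

    // Hypothetical types, just to make the snippet self-contained.
    record Entry(string DeviceId, string Payload);

    static class EntryGrouping
    {
        // Group list of entries by device identifier.
        public static Dictionary<string, List<Entry>> GroupByDevice(IEnumerable<Entry> entries) =>
            entries
                .GroupBy(entry => entry.DeviceId)
                .ToDictionary(group => group.Key, group => group.ToList());
    }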
ZorbaTHut@reddit
I keep getting annoyed at trying that trick and Copilot deciding it's easier to just keep writing eternal comments than write the actual code I want.
It works, like, 2/3 of the time, which is just often enough that I keep doing it and just rarely enough that it's a constant irritation.
Empanatacion@reddit
"AI isn't coming for your job. Somebody using AI is coming for your job."
I'm excited by how much this is flipping the table over. It's fun that all the rules are changing again and that I get a chance to pull further ahead of the people being stubborn about it.
I've always learned more quickly by having a conversation with someone that knows more. Yesterday Claude taught me in an hour what would have taken most of a day wading through docs that are 75% irrelevant.
And Claude doesn't condescendingly sigh and tell me to RTFM.
wwww4all@reddit
Claude may have given you quick results for now, but how will you retain any knowledge or experiences, if Claude does all the work?
I get the power tool analogies. But the real analogy is being handed a finished product that you might add touch-up paint to. So now you don't know how to use hand tools or power tools.
CarousalAnimal@reddit
Jesus, you out coding in a battlefield or something?
alpacaMyToothbrush@reddit
During the war on terror I did see a devops contract position on a FOB, which was pretty unusual. The thought of making commits under incoming mortar fire made me chuckle
congramist@reddit
lol can you imagine? Shit exploding left and right, trying to focus through it, you are pinned down with no way out, and then Gary from bizdev walks into your foxhole and asks if you know how to fix his grandmother’s home photo printer.
ZorbaTHut@reddit
I mean, to some extent, all of this is a competition; if you are (picking number out of a hat) ten times as productive as everyone around you, congratulations, you are now worth a lot of money. This is less true if everyone competing in the same market or for the same employer is ten times as productive.
Despite this I'm still happy to give advice, but if someone's response to the advice is "omg AI? slopcoder incompetent can't think for yourself" my response is going to be "Okay, whatever works for you then!"
AchillesDev@reddit
BuT yOu'Re JuSt A vIbE CoDeR
ZorbaTHut@reddit
I honestly am once in a while now; I needed some surprisingly complicated chunk of code, but with a very simple interface, and it would never be shipped to customers. So I just kinda vibe-coded it out. Ended up at 700 lines of code.
I verified that it was giving me the right output, and I kinda skimmed it to make sure it wasn't doing anything extremely dumb (and removed some error handling so that if there is an error I just get an exception thrown up through the stack), but that's about it.
I wouldn't do that for anything I was shipping outside of my own Jenkins instance, but sometimes it's appropriate.
fullouterjoin@reddit
I have had many friends get into combative, "convince me"-style arguments.
Talking to the royal you, the combative AI skeptic that won't do any open minded curious research and play...
I don't have the energy or inclination to spend mental and emotional energy to drag you to the trough. I just showed you how I use it and what it can do. If you can't be bothered, why should I tutor your ass for free, esp when you are fighting it the whole time.
My response is now, "It's, like, my opinion, man. You can give it a try." I don't push it at all. It is weird tho, I've stopped sharing my discoveries with these people.
thephotoman@reddit
I only demand reproducible evidence when someone makes a suspect claim. When a person attempts to quantify how much more productive a tool makes them, I want to know how they got that number. Most of the time, they got that number from somewhere in their lower digestive tract, as productivity is too poorly defined to measure quantitatively. All quantitative claims of productivity benefits deserve skepticism.
Generally, I'm not sold on arguments from productivity. I don't see an actual benefit from being more productive. I don't get paid by the story point or the feature delivered. And any attempt at quantifying productivity improvements is going to dash itself against the rocks of defining productivity well enough to measure it. Promotions? I'll get a promotion when I go to my next job and not before.
This is not a rejection of AI. There are tasks that I would relegate to AI if they were problems I have. If I were assigned to refactor some legacy code without unit tests, I'd likely turn to AI to autogenerate unit tests. But I don't really work with legacy code much right now. If I saw that it was actually an improvement on a Google search with "site:stackoverflow.com", I'd use it as such (and I do use it to generate some examples if I'm still not quite clear what the Stack Overflow post is on about). But it is a rejection of the AI hype. If you want to attach numbers to how AI makes things better, you'd better come with a source for that number.
dodgerblue-005A9C@reddit (OP)
I believe you did a better job of articulating this. there's no polite way of saying "you suck at the bare minimum so much that you need a machine to help you", but all i'm trying to say is that the higher-value stuff up the chain is shit and the narrative is shittier!
dodgerblue-005A9C@reddit (OP)
this was never about the job rejection and blaming ai for it; if it appears so, i apologize. it's the community's blind faith and the push of this narrative that i'm ranting about.
i'm trying to gauge what others think and also point out fallacies such as your "axe" vs "chainsaw" one. my argument is about the dishonesty, compliance, and lack of critical thinking
menckenjr@reddit
Okay, I'll chime in on it. I think LLMs have their uses, but I also think they give management too much of an excuse to mandate their use even in inappropriate areas. If you're going to use them for much more than rubber-ducking or autocomplete, you'd better know what you're doing; if you don't, or if you're really new, you'll need to double-check nearly everything you get out of them to make sure you aren't just pasting "hallucinations".
congramist@reddit
I half agree on most points. AI is obviously overhyped, but I didn’t need any convincing to try it and continue to use it for development. It works well enough to make it worth using. I can posit that while also agreeing that the people selling it are selling for more than it is worth.
I don’t think anyone here who is actually experienced has blind faith in a tool. I certainly haven’t seen that much here.
Claiming that an analogy is a logical fallacy with zero reasoning is an interesting way to dismiss a point.
dodgerblue-005A9C@reddit (OP)
chainsaw = llm/ai
axe = ?
tree = problem solving?
this is not a react vs svelte, rust vs golang argument. i think the conversation drifts a lot if we get into these nuances.
my argument is about our compliance with the status quo and not thinking critically about these tools
congramist@reddit
Axe = tools we had before the introduction of LLMs into mainstream development processes
Cutting tree = doing development
You are all over the place my guy. I mentioned that we have to consider the usage and maintenance associated with using a chainsaw in my analogy.
Like I said, I couldn’t have convinced you anyway.
Empanatacion@reddit
Your impression of the developer community is that it has blind faith in AI? This sub has a hate boner for it.
I think folks hear the ridiculous claim that AI is going to take our jobs and then lump it together with "this is a very useful tool".
If "slightly faster than typing" is the most use you are getting out of it, then you're not trying very hard.
adambjorn@reddit
What an excellent analogy, I'm stealing it for sure
potatolicious@reddit
+1 on this. I'm bullish on this tech when it comes to improving software development. I am far less bullish on the cult-y aspects (AGI, the Machine God) or the sci-fi automation aspects (your personal butler-bot! the robo-developer that turns a vague product description into working code!).
This stuff is both incredibly overhyped but also profoundly disruptive in a way that, as members of the field, we need to pay attention to.
I am markedly more productive even with really minor inclusion of LLMs into my workflow. Most recently I've been working deep in the guts of AOSP (the Android OS itself), banging my head against a weird problem that was impossible to diagnose. I asked my very human coworkers - some of whom wrote the damn OS for years, and nobody knew either. After a few days of fruitless debugging, it occurred to me that I never asked the LLM.
Note that this isn't Cursor, or some deeply-integrated AI workflow. I literally just booted up the Claude app, and prompted it with the symptoms, and what I've already tried. It came back with 3 suggestions on possible causes, each pretty obscure. Lo and behold one of them was it. I could've saved 3 days of head-desking if it occurred to me earlier to just type it in.
Ultimately these things aren't truly "smart" in a way we understand "smart", nor does it actually "reason" or "think"... but yet you can coerce a ton of useful work out of it, and that's all that really matters.
TangerineSorry8463@reddit
I fear that AI will enable mediocre people to pose as experts much more confidently, to the point where an outside observer won't be able to tell the difference.
thephotoman@reddit
Hi, I'm one of the tenth guys.
I usually don't get so opaque about "it's worth using as a tool, but be careful because it can stop skill growth". I'm quite clear that it's a barely adequate replacement for Googling your question with "site:stackoverflow.com". It's great at providing examples when Stack Overflow's answer gets a bit heady and theory-heavy.
Skills are built through practice. You need to do the typing exercise. It's a part of the process of learning.
alpacaMyToothbrush@reddit
I wrote up this comment on the subject yesterday and I'm not gonna rehash it here, but I pretty strongly disagree with the idea that 'progress has plateaued'. If you think that, you haven't been paying attention. The 'headline LLM models' might still have flaws but the rate of change in AI overall has absolutely not slowed. If anything, we're reaching a stage where changes and improvements are starting to compound.
I kind of handwaved away the 'AI 2027' paper, but I've noticed that even more critical voices on AI have moved their predictions forward, and even what we have today will be pretty damned disruptive as it diffuses through the economy, and this is the worst it will ever be.
TLDR: I am equally as critical of the folks that blindly trust AI as I am of my fellow greybeards who insist this is nothing but a bubble. Both are wrong, but the starry-eyed optimists are less wrong than those sticking their heads in the sand.
SomeEffective8139@reddit
Well, I'm a real developer who's been using Copilot for the last 18 months or so. I have 15+ years of experience. I transitioned back to full-stack dev and it's been really useful to help me learn React and become productive quickly. The models are getting a lot better, but are still far from perfect. I just view it as fancy autocomplete. Chatting to the agent is also a nice way to do "rubber ducky" debugging without an actual human.
daishi55@reddit
I’m the 10th guy. Not really interested in convincing anyone else, better for me if I’m one of the few who’s good at using these tools.
In a broader sense though I think it’s bad how many people are in denial about the social and economic changes that are coming.
rajohns08@reddit
Out of curiosity, what agent and model do you use?
RighteousSelfBurner@reddit
It's solid advice for "beginners". It's basically the same as the old "build your own project at some point instead of just following the tutorial".
As with anything that provides results it's easy to skip the understanding and learning part.
agumonkey@reddit
it will also make bad devs more "productive", so the high achievers will lessen in value.
i'm already seeing changes in my psychology regarding learning: every time I have a problem, I hesitate between using chatgpt or reading... and it takes an effort now... not far from the feeling of deciding whether to doomscroll or not.
sevvers@reddit
It's a constant thing everywhere. AI derangement syndrome. If you're not a 100x developer you just haven't ~~tried the right strain~~ used the right prompts yet.
sevvers@reddit
Like as professionals I thought we had come to some pretty solid conclusions:
* LOC is a bad metric
* Slinging tons of code does not equal being productive
* Writing code is the easy part
* Code is a LIABILITY
* Good engineers understand how systems work under the hood / don't program by coincidence
All these "axioms" and LLMs come along and overnight you're unemployable if you're not babysitting an autocomplete app to puke out tons and tons of code? Give me a break.
HaMMeReD@reddit
Eh, I wouldn't even say it'll stop your skill growth. It'll grow new skills while leaving old ones to erode.
It's not like LLMs do all the work, they are garbage in, garbage out. The skill is knowing how to provide high quality inputs to produce a high quality output.
ListenLady58@reddit
It’s literally not a black and white thing, but both the pro- and anti-AI people seem to talk like it is. AI is great for helping speed things along, not for completely replacing the developer or engineer. That’s the only reason I use it. If I forget some syntax formatting, that doesn’t mean I’ve forgotten how to program. It’s basically faster googling for me.
JaneGoodallVS@reddit
The experienced engineers in my private iMessage chat have begun letting non-devs fix bugs
ListenLady58@reddit
The devs gave access to the codebase to non-devs?
JaneGoodallVS@reddit
Yeah
marx-was-right-@reddit
Anti-AI people aren't against it being a faster google.
They're trying to combat the extremely dangerous claims that this can replace an actual human and provide business value as an autonomous object. They're also against the environmental aspects.
"Faster googling" and Ghibli pics shouldn't be using 2% of the world's power.
ListenLady58@reddit
Well there’s a lot of anti-ai people here who like to dump on people who use it for faster googling claiming that people become dumber because of it. So those are the anti-ai people I am referring to.
thephotoman@reddit
There are people getting dumber because of AI.
There are people using it for faster Googling.
While there's some overlap between these groups, there are plenty of faster-Googlers who clearly aren't getting dumb. To the extent that AI is "making people dumb," it's that it makes the value of learning some things (that might be regarded as important based on one's values) lower.
My question really is whether speed is my problem with Google. It isn't. My problem with Google is the amount of extra crap on the page with my search results. My problem with Google is that if I Google something, it's going to try to sell that thing to me, whether I want to buy it or not. But trading that and accuracy for speed is not a bargain that I am willing to make.
ListenLady58@reddit
I meant it’s faster in the sense that I don’t have to click through all of those unhelpful links in the search results that you mentioned.
Saying AI is making people dumber is equivalent to saying the internet is making people dumber. It’s an oversimplified assumption that doesn’t take into consideration that many people use AI (similarly to the internet) to explore, learn, and up-skill. If you don’t want to embrace AI, fine, that’s your life and prerogative, but I don’t think AI is going anywhere. Nor do I think software engineers and developers stomping their feet about it is going to stop companies from requiring their employees to use it. As we all know, companies don’t give a shit what their employees think. And as they say, if you can’t beat them, then you may as well join them.
thephotoman@reddit
You were actually clicking them? I could usually tell from the summary if the page was what I actually wanted.
Some people are using it to get dumber. For example, I've seen ChatGPT astrology readings. I'm pretty sure that qualifies as "using AI to get dumber". But not everybody using it will get dumber.
I do not believe that AI is "similar to the Internet", and if it is, we're in the dot-com era of it, before the initial hype gives way to the reality of what the thing actually is. Maybe it's similar to Windows 3.0, if it represents a UI revolution (I like the idea of conversational user interfaces).
I am consistently beating them without AI.
BlazeBigBang@reddit
People have been using ad-hoc, non-AI websites for astrology and whatever mystic bullshit you can think of for years. It's not new that morons have access to technology and that they use it for stupid ends.
maigpy@reddit
also the next search on Google will lose context. there is no conversation.
thephotoman@reddit
I don’t care about that, honestly.
I’ve basically never been in a situation where I wished my Google searches had more context. I don’t want to talk to Google. I want my search results.
rustyhere@reddit
This. I am always wary of people who either hype AI up to the moon or are completely against it. The truth is always in the middle. It can increase productivity if you are working with multiple stacks at once and forget the syntax/semantics, or sometimes it's faster than googling for the library docs. You still need to verify that the output is correct, for sure. It doesn't mean you are letting AI take the wheel; you are initiating the implementation with your own coding skills. To say that it's good enough to replace a programmer is a far-fetched theory based on what we're seeing.
FortuneIIIPick@reddit
> ai is a minuscule reason for layoffs
Wrong. AI leads decision makers to falsely believe they can lay off more of their workforce than they should. That puts it at the forefront, and it has been since 2022.
domo__knows@reddit
https://fly.io/blog/youre-all-nuts/
^ Pretty much that. I was an AI skeptic until 2 months ago. I have since seen the light and it's both exciting and very scary.
Yesterday I built some family tree software. My strength is modeling the data but I'm weak at any JavaScript that's not just some CRUD app. I asked Claude for some improvements on the data model and it gave me some great tips for structuring the tree. Then I fed it the JSON output of my tree and after 30 minutes Claude gave me a good-enough tree built in React. Then I fed this React component into cursor and asked it why the lines were so shitty and it generated better CSS and SVG to make the lines cleaner. My goal is to give everyone in my family access to the tree so they can upload photos, leave memories, write bios for dead family members I never got to know. Maybe add a public family wall or something.
It's not that I'm too dumb to figure this out all on my own but AI has been an extension of my abilities and filled in a bunch of my weaknesses. It writes all the very, very boring parts of the code that would've taken me days to do, if not weeks because I also have a life to live and getting started is often the hardest part.
Of course this is just a pet project but I can see now what a developer with deep understanding of a stack + domain knowledge + a little time can do. And the tools just keep getting better and better.
You can get on the train now or later but suffice it to say we're all getting on this train eventually if we decide to stay in tech, so why not get familiar with it now?
wwww4all@reddit
Do you understand the vibe code parts of the app you created?
That’s the crux of the issue. Yes, you can crank out lots of code with AI tools. Some may work.
But what have you learned to improve tech skills?
domo__knows@reddit
I absolutely understand the code it wrote. It's like writing vs. editing: it is extremely hard to write from a blank page, but when you're editing someone else's writing it's a lot quicker. The sentences may be terrible, but at least there's something to work with.
Good example from my 3 hour session yesterday working on my family tree: I thought the best way to model the relationships was to create two models, DomesticRelationship and ParentalRelationship, to map relationships between individuals in my family tree. I asked Cursor what could be improved and it told me I could simplify the two models into a single Relationship model, and showed me how. After getting my tree to work, the organization was weird, and Claude told me that genealogy software often uses a concept of a "Family" or "Union" to group people together. So I told Cursor to create a Relationship model and rewrite everything to use it, create a Family model, create a data migration, run it, amend all the scripts that created my tree, then fed the JSON back into Cursor and told it to recreate the tree in React, and the groupings were much better (roughly the shape sketched below).
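Roughly, the consolidated model looked something like this (illustrative names only, sketched in C# rather than my actual stack):

    using System;
    using System.Collections.Generic;

    // Illustrative sketch of the consolidation (names hypothetical).
    // Before: separate DomesticRelationship and ParentalRelationship models.
    // After: one Relationship model with a type, plus a Family/Union grouping.

    enum RelationshipType { Partner, ParentChild }

    class Individual
    {
        public Guid Id { get; set; }
        public string Name { get; set; } = "";
    }

    class Relationship
    {
        public Guid FromId { get; set; }   // e.g. partner A, or the parent
        public Guid ToId { get; set; }     // e.g. partner B, or the child
        public RelationshipType Type { get; set; }
    }

    // "Family"/"Union" groups partners and their children, which is what
    // made the tree layout grouping come out right.
    class Family
    {
        public Guid Id { get; set; }
        public List<Guid> PartnerIds { get; set; } = new();
        public List<Guid> ChildIds { get; set; } = new();
    }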
Of course this is fairly rudimentary but this would've taken me a few hours to do all that refactoring and it did it in about 20 minutes. I have so many other examples though of Claude helping me out. DevOps is not my specialty and without AI I would not have been able to build my CI/CD pipeline.
And just to note, I am extremely skeptical and laugh at anyone who thinks they're going to create anything of substance having learned how to program a month ago or the corporate types who are salivating at the prospect of firing all their engineers. But as the blog I linked to mentioned, AI is a tireless assistant that gets it mostly right. It really is an extension of my own abilities. I know when it's feeding me shit, when it's feeding me gold, and my own domain experience allows me to tell the difference between the two.
hippydipster@reddit
This is pure delusion. Pace is accelerating, not plateauing.
marx-was-right-@reddit
Seems like the tools are significantly worse than 2 years ago to me, likely due to training data being ass from all the "AI"
hippydipster@reddit
That's what I mean by "delusion".
marx-was-right-@reddit
If its "accelerating" then why are the AI companies spending over 3x their revenue with 0 flagship products or use cases that will take them to profitability?
hippydipster@reddit
There's no "if" about it. The acceleration is seen in all measured results. Why are they spending so much on it? Because acceleration is seen in all measured results.
Adoption is a different matter, and GPT-4 is only a little over 2 years old, and has nothing to do with the visible progress.
marx-was-right-@reddit
Measured results of what?
Progress implies usefulness in a business context. The toothpaste has left the tube on the "this is all just research and nonprofit" angle.
hippydipster@reddit
Benchmarks, of all sorts, including things like testing people's ability to distinguish human vs AI art.
officerthegeek@reddit
how does AI art indistinguishability help me when coding?
hippydipster@reddit
Do you not know the meaning of "including"?
officerthegeek@reddit
I do, but why would you use that as your example instead of something that could actually prove your point?
hippydipster@reddit
You mean like "Benchmarks, of all sorts"?
officerthegeek@reddit
1) I don't care about benchmarks of all sorts, I care about coding benchmarks, because that's what the discussion is about.
2) Just because you say there are benchmarks of all sorts doesn't make it believable. Just mention how AI usefulness in coding is benchmarked and how that has been improving, and you literally win the argument.
hippydipster@reddit
So feel free to check them out? They too are included.
Fuck me. I assumed you had some knowledge of anything at all going on in the world of programming and AI.
officerthegeek@reddit
Included in what? You only mentioned indistinguishability benchmarks. If you know the specific benchmarks relevant to coding, why not mention them in the first place? Because if I see a general statement like "there are benchmarks" with no specifics, instead of doing your work for you and looking up those benchmarks, I'll just assume you're talking out your ass
Doesn't matter what I know or don't know, you're making an argument, but you're doing it in a dumb incomplete way and getting pissy about it, even though you clearly care about making a good argument because you're bothering to argue on the internet. So please just help me understand your arguments by expanding on your answers like a normal human being rather than talking back like someone whose age is half that of your claimed YoE
maigpy@reddit
ALL benchmarks have been getting better across the FIELD and for a while.
if you don't know that, you are not up-to-date enough to take part in the conversation and should just read and inform yourself before commenting.
officerthegeek@reddit
sure, cool, ok, what benchmarks in specific should I look at?
maigpy@reddit
https://www.perplexity.ai/search/see-the-convo-give-this-guy-so-p2cg1uTmR62NEn8zc9UTYA
officerthegeek@reddit
on one hand, thanks for actually bothering to expand the argument rather than just complain about people not knowing what you know. On the other hand, it's very easy to interpret this as you not knowing those benchmarks yourself, if you had to rely on AI
maigpy@reddit
I knew some of the info in that reply, but I can't beat perplexity AI. it would take me 1 hour to research and prepare such a response. that's another ai tool that adds tremendous value.
officerthegeek@reddit
but frankly speaking you don't need a response like that. literally just naming the benchmarks you knew would be enough, and it gives a more authentic and more-likely-to-be-read response
hippydipster@reddit
Sorry, left my Gerber spoon at the nanny's.
officerthegeek@reddit
genuinely why argue on the internet if you're not willing to explain shit?
marx-was-right-@reddit
That has absolutely nothing to do with a sustainable business model. You can't just light 50 billion on fire and use as much electricity as an entire country to make shitty art.
hippydipster@reddit
We're talking about progress of AI capabilities. Go back to the top of this thread where I first responded - you seem to have forgotten the topic.
marx-was-right-@reddit
I'm not sure "progress" means the same thing to you as it does to the rest of the world. Benchmarking shitty art is a complete smokescreen for how useless and wasteful the tech is
maigpy@reddit
there are benchmarks of all types, don't get fixated on this one thing because you have no other arguments.
hippydipster@reddit
Second time you've used this phrase. Seems clear you have a bone to pick, so I'll let you do that on your own time.
marx-was-right-@reddit
You just brought up recognition of AI art as the primary example of AI "benchmarking", lol.
maigpy@reddit
lol, shitty art. stock photography has ceased to exist overnight. soon advertising will be next. translators, copywriters, authors of all kinds... software engineering, yes, I am so much more productive.
models becoming smaller and easier to run as we speak...
ThisGuyLovesSunshine@reddit
This just tells me you haven't used AI and have no idea what you're talking about. Delusion is correct.
thephotoman@reddit
You started with an ad hominem, then moved on to another assertion made without evidence.
I don't know if improvements are accelerating or plateauing. I know that as an end user, I'm still deeply underwhelmed by AI. It's still a tool that I just do not care about, and the trials I'm giving it--which are typically how I begin integrating a tool into my workflow--are going so poorly that I'm giving up more often than not.
hippydipster@reddit
There's a lot of benchmarks out there to check out. You don't have to just sit there not knowing.
thephotoman@reddit
Let’s engage in a debate for a bit. I’m performatively going to engage in skepticism with you so that you can make your point better than you started.
One thing that I do know is that benchmarks aren’t always a great measure of real world performance.
It’s also quite possible for a credible benchmark to turn out to be useless in reality, because we didn’t fully understand what we were actually looking for.
When a gamer sees improved system benchmarks for new hardware, he has an understanding of what those benchmarks translate to in his experience of playing the game.
If users are telling you one thing while the benchmarks are saying another, it is far more likely that the benchmarks are bad. And in this thread, you’ve been getting a lot of people suggesting that, for their use case, they aren’t seeing an improvement they can feel (and feels over reals is the reality of user experience).
You need to make an affirmative case here. How have these benchmarks led to an improvement in the code output of LLMs? Show me the data.
You’re not here to convince me. You’re here to demonstrate to the audience (this conversation is public, engage with the marketplace of ideas) that OP is wrong, and AI will…I don’t actually know what you think you want from AI. Make your case.
hippydipster@reddit
It's not a debate. You are free to check out the state of the world anytime you like.
thephotoman@reddit
I am looking at the state of the world.
And I am not convinced. I've found generative AI to be at best a poor replacement for site-specific search on Google and a demonstration of the general lack of knowledge about the code generators available. "But a code generator can't write my unit tests for me!" Yeah, that's because that's a bad practice. TDD is the best practice. Write your tests first, that way you know that what you wrote is right. You might then have the AI write code to pass those tests, but I don't think you want to do that. That's the fun part, and it isn't a particular drain on my productivity to type it out myself. It's the easy part anyway.
I'm watching my coworker spend 30 minutes crafting a Python script with AI and claiming it a productivity booster when I turn around and whip out a shell one-liner that does the same thing. I'm watching people use it in place of a more-reliable deterministic code generator that was already in their IDE. And I'm quite worried about what happens when the AI companies have to start turning a profit (because they haven't yet). I'm watching people turn "writing the code that does the thing" into "debug a bunch of code written probabilistically"--making the job harder, not easier.
hippydipster@reddit
You seem to have forgotten the question. The question is, is it plateauing or accelerating (or just continuing to progress)? You seem to have instead gotten stuck on a question that wasn't asked: "can /u/thephotoman use AI productively right this moment?"
thephotoman@reddit
And I’m showing that whatever “gains” it’s making are not showing up in user data.
We’re not seeing significant improvements in job relevant tasks. Nobody is. They’d be talking about how much better things are. It’d be a compelling case for tool adoption with real world data to support the assertion in addition to benchmarks.
lovest6@reddit
Agree. Either these people don’t know how to use the tools properly or they’re just delusional.
I use these tools daily at work and the improvement is staggering. I can code maybe 10x faster with AI with reliability trend going upwards.
For example, Claude 4 is so much better than 3.7 on consistently outputting good code.
InvestigatorFar1138@reddit
If you are working only on greenfield projects in which 90% of the work is boilerplate, or if you were not a SWE before, maybe. If not, 10x productivity gains with AI for a seasoned engineer working on complex problems is as delusional as no gains at all. Most of what AI outputs is still bad code (or just wrong). Midlevel+ engineers should be able to see where it falls short and make corrections to the output, but that costs time and cancels most of the perceived productivity gains. In a lot of tasks I find it actually slower than just writing the code myself.
maigpy@reddit
20 years of experience here, ai making me 10x more lethal across the board.
thephotoman@reddit
If you believe that "90% of the work is boilerplate" on a greenfield project, you must not be familiar with code generators. They're static tools that you can use to automate the process of writing the boilerplate. Most IDEs include code generators like this.
In fact, my knowledge of my IDE's code generators has really limited the usefulness of AI in my workflow. Between that, my knowledge of how to shit out a shell one-liner for any purpose, and an overall TDD-based workflow, AI doesn't really give me anything I need.
farazon@reddit
I keep seeing this "AI good on greenfield" argument and it totally baffles me, because I experience the exact opposite. Greenfield, you need to tightly detail prompts and watch for it going off in a completely wrong direction. In a mature codebase where, to deliver a feature, I need to e.g. write a job wrapper over my code, plus an API, plus a reader/writer, etc. - that's where I find AI really shines. I focus on writing the interesting parts, and offload the boilerplate stuff to AI, prompting "look at this example package, and do everything the same, incorporating changes x, y, z".
InvestigatorFar1138@reddit
I don’t think it’s particularly good on greenfield either if you just let it loose, but it is the type of work that I find myself writing the most boilerplate and/or repetitive code, and it does work somewhat better because I can use existing projects as an example.
I agree that the type of task you described is also a good use case, as is any task that is somewhat similar across different resources but different enough that you can't really DRY it. Frontend tables/forms are another example where I've gotten a productivity boost. Regardless, once you have good templates for the repetitive parts, the meat of the work is writing the core business logic, which AI sucks at IMO. AI is faster than copy/pasting templates and changing types and fields on them manually, but not so much faster that I would claim anything over a few % overall productivity gain.
lovest6@reddit
I get what you’re saying, that was my experience a few months ago. Things have been evolving rapidly and I don’t share the perspective of most people in this discussion.
I am a seasoned dev, 17y of exp, and I work in large codebases. My take is that people who downplay these tools haven't been able to get to a productive workflow with them yet.
DigmonsDrill@reddit
People have been telling me on reddit for years that things have stopped improving in AI. People will insist they are at an (unnamed) AI company and have had no progress for months. A week later something new comes that can solve known challenges.
Remember, most of what you read on the Internet is written by insane people.
It's too bad, because I agreed with many points OP was making, but it seems like they just piled on whatever anti-AI arguments they could find without evaluating their correctness.
ryanstephendavis@reddit
There are clean-data and energy-consumption limitations that these LLMs have hit, making it pretty hard for them to keep getting better... I disagree
hippydipster@reddit
That's a theory - it has yet to show up as a measurable stall in progress.
eaz135@reddit
The great thing about AI having so much hype and attention, is that we don't really need to speculate much about the capability, there's already well thought out benchmarks such as SWE-bench that attempt to quantify the progress AI is making with software development.
SWE-bench is essentially a benchmark that utilises a bunch of Python Github issues, and runs a suite to see how a model performs in creating PRs to resolve the issues (which can be judged by unit tests that were previously failing now passing).
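To make the scoring mechanics concrete, here's a rough sketch of how a harness like this can judge a model-generated patch. This is purely illustrative Python, not the actual SWE-bench implementation, and the repo path, commit, and test IDs are placeholders:

```python
# Illustrative only - not the real SWE-bench harness. An issue counts as
# "resolved" when the model's patch applies cleanly, the tests that were
# failing before now pass, and previously passing tests still pass.
import subprocess

def run(cmd, cwd):
    """Run a shell command inside the repo checkout; True on exit code 0."""
    return subprocess.run(cmd, cwd=cwd, shell=True).returncode == 0

def resolves_issue(repo_dir, base_commit, patch_file, fail_to_pass, pass_to_pass):
    # Start from the exact commit the issue was reported against.
    run(f"git checkout -f {base_commit}", repo_dir)
    # A patch that doesn't even apply scores zero.
    if not run(f"git apply {patch_file}", repo_dir):
        return False
    # The previously failing tests must now pass...
    fixed = all(run(f"python -m pytest {t}", repo_dir) for t in fail_to_pass)
    # ...and the fix must not break tests that already passed.
    unbroken = all(run(f"python -m pytest {t}", repo_dir) for t in pass_to_pass)
    return fixed and unbroken
```

The headline percentages on the leaderboard are essentially the fraction of issues for which a check like that comes out true.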
There's an interesting refinement done in collaboration with OpenAI to create a "Verified" version of the benchmark: a human-curated set of issues determined to be suitable for evaluating AI (i.e. filtering out issues that the AI would realistically never be able to solve). Details about the Verified set can be read here: https://openai.com/index/introducing-swe-bench-verified/
The above post shines light onto things that many already intuitively suspected from playing with the tools - but didn't have the data/evidence to back it up. What you will find from the post and benchmark results is, whilst some of the headline numbers look great, when you break down the issues by difficulty (easy, medium, hard) - the models tend to do quite well with the easy problems, but start to really struggle with medium and hard problems, and essentially totally fall over when the issues themselves contain ambiguity (where in the real world the engineer assigned the task would be having conversations with various people to clarify the situation).
Problems were categorised into the easy, medium, hard buckets by humans - by estimating how much time would be needed by a senior engineer to solve the issue and have a PR ready.
That is the current state of affairs: we can see that over time the models have indeed been improving, and we are starting to see genuinely impressive numbers posted, but the reality is that they still struggle when it comes to real-world problems that contain ambiguity that needs to be clarified (i.e. the full set of items prior to being filtered into the Verified set), or difficult challenges that require actual critical thinking, real reasoning, and an understanding of causality.
Which-World-6533@reddit
There's still no quantifiable results here.
I've seen these AIs try to solve issues. It's horrendously bad.
eaz135@reddit
What do you mean there aren't quantifiable results? Quantifiable results is literally the purpose of the benchmark. Simply go to https://www.swebench.com/index.html and select the tab for which issue set you want to look at (Lite, Verified, Full)
Which-World-6533@reddit
Those results are dire.
There's no way I would hire a Dev who can only solve 60% of tickets.
maigpy@reddit
not even if the developer works for free?
Which-World-6533@reddit
Why the fuck would a Dev work for free...?
maigpy@reddit
take your time to re-read and apply some paid developer logic.
Which-World-6533@reddit
Ok, mate.
Master-Broccoli5737@reddit
Good job chatgpt
eaz135@reddit
What, lmao... My writing style is nothing like ChatGPT
another_account_327@reddit
Really wondering if the other users didn't notice they're replying to a bot, or if they are bots too.
Master-Broccoli5737@reddit
The internet is good and cooked
dodgerblue-005A9C@reddit (OP)
the benchmarks look good in principle, but given it's from openai, i wouldn't personally trust it. not because they're incompetent, but because there's a conflict of interest.
also, given the small no. of independent evaluators, we've no way to prove they're not being bought off behind the scenes. dev influencers are bought by the dozen and they're as transparent as a brick.
eaz135@reddit
The benchmark isn't from OpenAI. The OpenAI collaboration was around reducing the original set of issues used in the "Full" benchmark down to a "Verified" set of challenges - a set of issues that have been curated by humans as being realistic for AI to solve.
You can see the benchmark results here: https://www.swebench.com/index.html - select the tab for which issue set you want to look at (Lite, Verified, Full)
dodgerblue-005A9C@reddit (OP)
you're right but look at https://www.swebench.com/contact.html
first guy is Carlos E. Jimenez - AI and ML student, couldn't find the 2nd guy but i'm confident he would belong to this niche
do you not see the bias here?
eaz135@reddit
You see bias / conflict-of-interest because someone studying AI and ML built a tool to benchmark AI models?!
Its all open source:
https://github.com/SWE-bench/SWE-bench
You can do your own runs/evaluations, view logs and results yourself if you have doubts:
https://www.swebench.com/SWE-bench/guides/quickstart/
dodgerblue-005A9C@reddit (OP)
ok, so i looked at their codebase and their idea in principle appears fair, but looking at SWE-bench/SWE-bench/swebench/inference/run_api.py, they directly call the respective service providers.
their evaluation criteria is: given a pair of an issue and its fix patch, they compare how close the model-generated code is to the actual patch.
i see the following unknowns:
there's no evidence the model just didn't access the internet to regurgitate the same/similar patch
choice of the repository
just a few off the top of my head (i've sketched the first one below)
the idea is to gain trust; any benchmark needs to be more scientific and has to stand up to rigor. you need only one argument to falsify a hypothesis.
this is still a hypothesis
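to make the regurgitation concern concrete, here's a rough sanity check i'd want to see. purely illustrative python, not something swe-bench itself does, and the data structure is made up: flag resolved issues where the model's patch is nearly identical to the known gold fix, which would at least hint at memorization rather than problem solving.

```python
# hypothetical contamination check, not part of SWE-bench: near-identical
# patches suggest the model may be reproducing a fix it has already seen.
from difflib import SequenceMatcher

def patch_similarity(model_patch: str, gold_patch: str) -> float:
    """similarity ratio in [0, 1]; 1.0 means the two diffs are identical."""
    return SequenceMatcher(None, model_patch, gold_patch).ratio()

def flag_suspicious(results, threshold=0.95):
    # results: list of (issue_id, model_patch_text, gold_patch_text) tuples
    return [issue_id for issue_id, model, gold in results
            if patch_similarity(model, gold) >= threshold]
```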
Fantastic_Elk_4757@reddit
At what point would it be accessing the internet? On the model side?
This can easily be tested by yourself… it’s not accessing the internet for information unless you go fetch it yourself and provide it.
maigpy@reddit
there is no way you can make such a statement for black box services like openai. they could be calling your mum in the course of servicing the request.
Fantastic_Elk_4757@reddit
Do you also think evolution isn’t really supported because all the people involved in researching it are - scientists?
dodgerblue-005A9C@reddit (OP)
That's not the question, right? There's a gap in communication. "Research" here is a hypothesis, it's an opinion. It should be seen and advertised as one. If any research forwards my position, i've an incentive to not resolve the ambiguity and keep the language vague. It should have no more authority than that of an opinion. It becomes a fact when it stands the rigor of time. But somehow a benchmark appears to induce more confidence, because not everyone can understand it or has the patience to vet it
Everyone is entitled to an opinion not facts
Crannast@reddit
SWE-bench is a nice benchmark, but it's also not as representative of software engineering as the name claims.
It has a limited and simple sample (only open source Python projects). Many of these issues are also old, and the training data of many LLMs has already been contaminated by them. The issues are mostly small and self-contained.
This is a MAJOR problem with AI companies and the reporting of AI results. People see "70% on SWE-bench" and assume it can do 70% of my job, while in reality my job doesn't overlap at all with this benchmark. AI companies are more than happy to propagate this misconception.
AI benchmarking is a disaster in general.
eaz135@reddit
Yep, mostly agreed - and that is what I wanted to highlight in the OpenAI post about the Verified set. The post very clearly states that, at least with these Python scenarios, their models currently perform decently on easy problems but fall off quickly as things get more challenging.
The vast majority of software engineering jobs these days aren't about getting through piles and piles of easy problems - it's about navigating ambiguity, using reasoning and critical thinking to design solutions, and delivering work that makes sense in the broader ecosystem of the codebase and wider systems. Then when it comes to building tools that are user-facing in any way - there is the whole aspect of designing and executing in such a way that it's usable and makes sense for a human.
I think that SWE-bench is a great initial concept in the effort of quantifying how the capabilities of these tools are advancing, but I do agree that the site (especially the main homepage that lists the benchmark results) could do more context setting (i.e. the nature of the Python issues), breakdown of issue types, breakdown of issue difficulties, etc. Just rattling off the headline percentages without that context can definitely be interpreted the wrong way.
daedalus_structure@reddit
In every single due diligence conversation I have either been a part of or have had access to the details, the investors are asking how small the team has been made due to AI.
The folks who are about to get more work are the security folks.
AI is software that can be social engineered, and there is no longer a clear distinction between safe input and unsafe text with escape sequences and code instructions.
You have cause to be worried, because the people investing hundreds of billions into AI aren't doing it so you can generate cat pictures.
They are explicitly doing it to eliminate software engineering salaries in the profitability equation for delivering a software product.
Anyone who doesn't see that is just as foolish as the people who invented it.
And for all the arrogance and self congratulations about how smart we all are, I've never seen a plumber be so fucking stupid that they would build a tool that replaced plumbers.
PoopsCodeAllTheTime@reddit
Yeah but all of the value that will make it worthwhile for the profit equation is still all speculation ATM.
Also every boomer boss seems to believe themselves smart enough to turn it into a profitable strategy, as if it wasn't some super complex thing that they'll likely get wrong.
Beginning_Occasion@reddit
I agree with these points.
People don't realize that we're still in the phase where companies take losses to try to gain market share. These companies will need to recoup their losses. This will lead to the biggest wave of enshittification we've seen.
Like, imagine a world where we have to pay a good percentage of a developer's salary per dev for these tools. Dev salary of 100k? What if we have to pay 30k in AI-related expenses for this dev, and we've dug ourselves too deep to have any option otherwise? And what if the net productivity benefit is only like 10 percent?
Too bad we can't have reasonable discussions on this topic anymore.
eaz135@reddit
Yeah its a great point and consideration around the commercial model.
My assumption was always that when the models get to a certain level of capability, that is where you will start seeing tremendous commercial and government demand, to do things like medical research, scientific research (maths, physics, chemistry, etc). You'd imagine if/when the models get to that level of ability that the compute should be used to solve those real/meaningful challenges - rather than spitting out cat memes or React CRUD apps. You'd imagine that the likes of Pfizer, GSK, Bristol Myers, Raytheon, Lockheed Martin, Boeing - etc would all be bidding like crazy to have compute access to a model with that level of capability.
The issue is, if the model capabilities start to really plateau and don't get to that level - what prices do these companies need to set to keep the status-quo going at a sustainable level without going out of business? Surely in that scenario the prices would be way higher than what they are charging today.
DanielCastilla@reddit
And early research is showing that they are already plateauing (amount of training data and parameters needs to grow too fast to reach measurable levels of improvement over previous models), hallucinations are increasing (maybe due to a lack of good training data), which further pushes the real cost vs usage disparity we have right now.
PoopsCodeAllTheTime@reddit
Summarizer parrot can't come up with original ideas, who knew!
ExtremeAcceptable289@reddit
I mean via API, which a lot of people use, they already make huge profit margins. For example DeepSeek revealed that their margins were around 5x, and DeepSeek is one of the cheapest providers. It's mainly the subscription-based tools like GitHub Copilot, Cursor, Windsurf, etc. that lose money.
ICanHazTehCookie@reddit
Do you have a source? Most recently I read https://www.wheresyoured.at/wheres-the-money/ and it seems AI providers are bleeding money, even on paid users. The "On API Calls" section proposes that it's a small portion of their userbase, but unfortunately doesn't have numbers on the margins.
ExtremeAcceptable289@reddit
https://www.outlookbusiness.com/start-up/news/deepseek-claims-daily-profit-5x-higher-than-cost-boasts-545-roi
Here's one about DeepSeek. This is an API-only company (they exclude webchat)
DanielCastilla@reddit
AFAIK the difference is that DeepSeek is an MoE (mixture of experts) model, which means that inference is significantly cheaper
ICanHazTehCookie@reddit
Thanks! The article does note the numbers are hypothetical, but promising nonetheless.
tommy_chillfiger@reddit
I've been bringing up this point about market share/profit lately, seems like nobody sees this coming but it feels obvious to me. I feel like companies are going to build their entire operations around LLM tools and then when it comes time to actually price for profit (or, god forbid, price in some of the fairly gnarly externalities of these data centers) it'll be like "whoops! now it's >$25 per 10 word prompt. good luck!"
And/or, as you say, it becomes unusable due to something like ad supported tiers (which is such a disappointingly uncreative way to monetize literally everything online). Imagine having to watch a fucking 30 second ozempic ad or whatever to get your increasingly shitty LLM to spit out some CSV cleaning.
Which-World-6533@reddit
I completely agree with this. Being able to code competently in the future will be a more valuable skill.
Too many companies will rely on AI slop to produce products, just in the same way companies rely today on cheap third world workers.
Being able to fix all the slop and make something unique is going to be a key skill.
CNDW@reddit
I feel like the world is gaslighting me about AI. I keep trying to use it to boost productivity but the results I get from it are mixed at best. The hype does not align with my personal experience. My understanding about writing software over the last decade tells me that the output is complete slop. Barely functional, full of logical gaps. It doesn't come close to meeting my personal standards for good code, so I spend as much or more time cleaning up and fixing the output as I would have spent just writing it. I know how it works, and knowing the technical details just makes me trust it less. If I can't trust it, how am I supposed to leverage it? I keep being told it's the new way, I keep trying but all I see are the drawbacks with very minimal upsides. It feels like the industry is having a fever dream and I'm just over here waiting for the fever to break.
ghost_jamm@reddit
I keep coming back to this article called “Chat-GPT Is Bullshit”. It argues that LLMs fit the philosophical definition of “bullshit”, essentially that they are totally unconcerned with whether or not their output is truthful. I don’t know how you argue against that. They literally aren’t designed to give correct output and can’t know if their output is correct or not. In what other technology would we find that acceptable?
Combine that with the massive environmental impacts and the huge violations of copyrighted work and it doesn’t seem hard to me to make the case against using LLMs.
CNDW@reddit
Honestly this is the thing that worries me most. The general public doesn't understand that nuance, they just see a sophisticated chat interface that talks back to them as if it's another person and understandably assume it's "thinking". How much damage is going to be done to our society before people start to understand that it's unreliable at a fundamental level? We already see people trying to use it for scientific papers, thinking when it spits out citations it's because it's reasoning about things with actual resources and not just spitting out bullshit. At some point this understanding will reach the general population, but at what point?
PoopsCodeAllTheTime@reddit
Barnum effect, same as horoscope, which is very popular even though it is obvious BS.... so it tracks
xmBQWugdxjaA@reddit
Occasionally it works really well - especially for translation tasks.
So like we update some API and need to fix 5000 lines of unit tests, give it some examples and let it try.
But I really only use it for tasks a bit like that, and sometimes desperately to rubber-duck with hard bugs, once Gemini did find a really tricky one!
CNDW@reddit
This is way more inline with what I've experienced. Translating files in some way, avoiding a bunch of tedious edits, gives me mixed feelings. I have a hard time trusting that it doesn't hallucinate the translation and as a result I spend a lot of time going over the results with a fine toothed comb, which results in me finding hallucinations and I trust it less. There is some nexus point in which the amount of time I would have spent doing it myself with bulk find/replace would have been faster, but I can't know that until after I spent the time trying to get AI to do it.
I worry more than most about AI writing tests. What is actually valuable to test is a very tricky subject and people generally wind up testing implementation details over behavior, and I have seen it hallucinate a false positive in more than one instance. Although I find I think about testing in a way that most people don't, which doesn't help the feeling of gaslighting. Like for me to embrace AI I feel like need to let go of my own personal quality expectations and just embrace mediocrity.
Maybe none of it matters? Humans are inherently flawed (myself included) so who cares if the tool I'm using has similar flaws in the output? Most of those flaws are the distilled result of human flaws in code output anyway because of how the model is trained, so maybe it's fine and I'm just being annoying?
It's been most useful getting it to explain cryptic code to me as I work on understanding it.
PoopsCodeAllTheTime@reddit
So you are telling me that LLM is a wonderful tool of malicious compliance that lets me hit the inane quota of 100% test coverage regardless of false positives. Sounds great! I'll buy 10.
xmBQWugdxjaA@reddit
Yeah, I feel exactly the same way as you, like I want it to handle the boilerplate, but not just write everything as I want to check the tests actually check edge cases, etc.
That said I was debugging some Docker firewall edge case today and it's so nice to just talk to Gemini about it. Like no-one else I know uses Linux and firewalls enough to help!
wvenable@reddit
Do you google for stuff?
Instead of doing that, ask the AI.
That's the easiest way to start.
Remember the AI isn't smarter than you, but it's likely more knowledgeable than you. So don't give it tasks where you have to think; instead give it tasks where you need to know something that you don't. It's a subtle distinction.
I can give you an example, this week our sys admin pasted this to me in the chat:
I took that exact message with no other context and pasted it into ChatGPT and it told me everything I needed to know to understand and solve the problem.
CNDW@reddit
That's what I do now TBH. The only useful thing I've noticed is googling for basic things and "explain this line". Depending on the source material, results are questionable. If it's anything other than basic language questions it breaks down and starts hallucinating like crazy.
Asking it Docker questions is very productive. Working on Docker builds is the thing that I don't do often enough to have built a working memory of the syntax or how things behave, and how it works has been relatively unchanged over the years, so the results are very consistent.
None of this goes further than the "huh, that was kinda neat" threshold for me, certainly not the earth shattering new tech that's going to change the world that it's being hyped up to be. It's just a better google search.
wvenable@reddit
What are you using? I'm using paid ChatGPT and I'm asking it complex and esoteric things and it has no problem with it. I would argue there's actually no way that I could figure some of this stuff out on my own without spending days reading documentation and/or perhaps getting into the source code of some 3rd party libraries.
CNDW@reddit
I've been using paid Copilot for most things. I've tried free tier chat gpt, and a paid version of Gemini. None of them have felt any different to me.
wwww4all@reddit
The hype cycle is 3 years into the "AI will replace software devs in 6 months" pablum.
There’s really no answer possible, when people are trying to hammer in non deterministic tools to solve deterministic problems.
noonemustknowmysecre@reddit
Yeah, me neither. It still can't crank out a simple script to do a thing without functionality-breaking bugs. And debugging is far more labor intensive than writing. The output will look good, but it won't work. Even arguing that "well, it gives you something to work off of" is just bullshit, as green-field development is fast and easy while legacy development is slow and painful.
The good part is that it's a phenomenal search tool. If there are hundreds of people using a library out there and talking about it online, you can ask GPT about it and it'll whip out a very nice and detailed explanation of what exactly is going on with any part of the library. It was a great help with taxes. "Why is the amount of capital gains taxed in the 0-15K range zero?" And it'll give you about 4 paragraphs explaining why you're an idiot and how cap gains start counting from the top of your income, not a separate track like I had thought.
But it can't just make the stuff for you, not yet. Not without a very precise and detailed explanation of what to go make. And what do we call a very precise and detailed set of instructions to a computer about what to do? Code. We call that code.
eaz135@reddit
It's definitely an emerging tech, and as an industry we are still discovering where/how it fits and how to get the most out of it.
The notion of fully autonomous AI agents completely replacing software developers in the very near term is farfetched, and I think it's the wrong way to look at AI in our industry. However - that doesn't mean there haven't been really amazing outcomes from using AI to enhance software development.
Have a read of this article from AirBnb - it will open your mind on intelligently applying AI in certain situations where it makes sense:
https://medium.com/airbnb-engineering/accelerating-large-scale-test-migration-with-llms-9565c208023b
creaturefeature16@reddit
See, to me, If software never evolved and was always a known quantity and scope, then yes, that would spell the end of engineers and developers alike.
But this kind of stuff just shows me that we're going to push these systems to their limits and increase the complexity of the software we're writing, which means we'll always not only need engineers and developers, but chances are will need the same amount if not more.
tooparannoyed@reddit
It’s great for well known solutions that would normally require me to google up an existing implementation or SO discussion and then consult docs for syntax. Even then, it hallucinates features if my use case is different enough. I also normally have to tell it to condense or fix something arbitrarily verbose.
It’s speed at the cost of accuracy and efficiency.
enjoirhythm@reddit
I went to a "coding with AI" conference this week, my job offered to take me, and saying no felt like it was out of the question.
Despite the name (maybe this is up for interpretation), the presentation was mostly non-technical business guys asserting this stuff is the future while using incredibly trivial examples to show it off, and telling me that I'm falling behind if I'm not using it.
I'm sorry, but getting on stage using a 260 dollar Claude pro subscription to vibe code a chatGPT wrapper is not helpful to me. The fact that you used a mystery box to write an app that leans on another mystery box for your business logic is.. I dunno, it's something.
Later on there was a panel where I politely asked about ways that they had used ai in a more long term project with some real complexity behind it. Their responses felt intangible and unsatisfying. One guy even came off as gleeful that he could decrease the size of his workforce by offloading the smaller junior developer tasks onto a service like Devin. I felt like I was showing restraint, but I was definitely annoyed by the end of it all.
What's worse is when I hear other employees describe their experience with the event, they generally say that it was very interesting and it made them excited about the future. I genuinely don't understand, what are they seeing that I'm not?
I'm not opposed to using this stuff if it's helpful to me, but all I'm seeing so far is people demonstrating an AI playing tee ball and then telling me that it's ready for the major leagues.
marx-was-right-@reddit
No one, including leadership, knows what's going on in the magic box, even though it's public knowledge that it's just a text prediction algorithm. They attribute it to magic and are afraid to be the guy that refuses to use the magic and gets fired.
I've been fairly outspoken about irresponsible AI use - think massive hundred-file PRs chock full of security issues and flat-out broken code - and I've gotten a talking-to by management to tone it down. I asked why all these devs aren't producing at a higher level if this is such a magic tool, and they had no response.
PoopsCodeAllTheTime@reddit
The LLM is naked but no one dares to say so
Castyr3o9@reddit
Keep your head in the sand then, the tooling is maturing and the cutting edge is impressive. We’re figuring out how to glue the intelligence together to solve more complex and abstract problems and automate more. While software engineers won’t be obsolete for a long time, far fewer of us are needed.
wwww4all@reddit
What complex and abstract problems have AI solved?
All AI companies are still using human devs to build AI tools.
CodyEngel@reddit
AI works at the same level as a junior software engineer, it will put out working code but whether or not it's the correct code is still on you.
There is a new setup process as well. They will read from rules files and getting good at prompting and setting up those files is also crucial. You still need to know how to code. I would love to see a production application built and maintained by a PM or business stakeholder, mostly because it will almost certainly fall apart.
That said though, if you aren't using AI, you are leaving a lot on the table. I'm still in my early days of this but folks at anthropic are kicking off multiple agents to work on multiple projects. Imagine if you didn't need to focus on just one project but could instead supervise several being built out.
wwww4all@reddit
Pretty much every company that claimed to be using AI in production had to walk it all back and confess they're still dependent on humans. Amazon, Builder.ai, Klarna, etc.
Abject-Substance1133@reddit
I need us senior devs to realize how much of this is cope, because it's going to be coming for your job too in maybe less than 3 years.
You see it here. Don't let your hubris get the best of you. At the rate it's learning, it will come for your jobs. And if it comes for your jobs, it's going to be coming for every white collar job.
What happens to the 44% of the workforce in America that are white collar workers? Also, keep in mind, the real big fear behind AI is how it can be used to manipulate you. That's why people are so against it, even morally. A company could use AI to push insane narratives. It's so convincing. It's all at the whims of the companies to make money.
Think about how many people are dependent on ChatGPT. I've seen people really start to automate their jobs with it. To me that means an AI company could tell you anything about news. Anything about technology. Anything to subtly push your own thought process.
Humanity will become redundant. People will see no future. It is scary.
Usernamecheckout101@reddit
Thanks
PoopsCodeAllTheTime@reddit
I love LLMs, aider.chat with o4 really saves my fingers the trouble of typing all those tags for an HTML table!! 😂
_ncko@reddit
The way I see it, code is not an asset but a liability, and LLMs just help non-technical people generate more of it. I think this creates less demand for SWEs in the short term, but more demand in the long term.
lilcode-x@reddit
Sometimes AI works, sometimes it doesn’t. I recently had to do a codebase-wide refactor, just a simple renaming of components. I had copilot do it and it got like 80% of it correct and the rest was relatively quick to do. It was a very trivial task that wouldn’t have taken me long to do either way but it was nice to just let copilot do it to save a bit of cognitive energy on my end
slasher71@reddit
As an immigrant who got laid off recently, it feels nice to understand the details, but nothing fixes the panic like prepping and applying everywhere. Thanks for trying to explain it. The tax code change, the overhiring during the pandemic, AI being impressive but not as awesome as all the CEOs are claiming - it's been crazy. It would be great to see things change, but this being an employer's market means employees across the board will be mistreated for the next 2-3 years, and management for the most part will show the worst of themselves as they power trip. Question is - how best can we be professional, understand our tooling, and push back with data?
failsafe-author@reddit
The main issue is the higher ups pushing it without a clear problem to solve.
Additional-Map-6256@reddit
This is just wrong. It looks like you read the post over on cscareers about the tax bill and fed it to an AI to generate this post. The tax changes took effect in 2022, not 2025.
The layoffs are happening now because of extreme overhiring during COVID, combined with interest rate changes, and the (mostly false) promises of AI. Those of us who actually understand software know that AI is not going to do all that those selling it promise, but unfortunately we are not the ones in charge of headcount. Those are the non technical executives who don't realize they are being played for a cash grab.
thephotoman@reddit
No, he's right. The problems started in 2022, when the tax changes took effect and interest rates rose quickly.
Sam Altman and the other hucksters selling a Markov chain generator as a solution to every CEO's concern that he can't exploit his workers are a part of the excuse we're being sold, and they are convenient scapegoats for the proponents of the current tax code.
And again, yes, we're currently in an offshoring part of the cycle (things work, we just need someone to maintain it, and it'd be nice if that work happens overnight). But that effect is being magnified by the 2022 tax changes, interest rates, and political and economic instability in the US.
ButteryMales2@reddit
What beef do you have with sentence case?
dodgerblue-005A9C@reddit (OP)
i don't know, never saw much point in them. or just lazy. fellow ocd
humanguise@reddit
I agree, but management is drunk on their own Kool-Aid, and they're the ones making the decisions to lay people off.
IlliterateJedi@reddit
Is there a citation from actual AI researchers that support this is the reason for increased hallucinations in reasoning models? Because I can imagine other more straight forward reasons that these models would hallucinate that have nothing to do with training data.
syklemil@reddit
My stance here is more in the direction that
wwww4all@reddit
The basic gist is it's all hallucinations, just some hallucinations happen to be more correct than not, maybe.
dodgerblue-005A9C@reddit (OP)
while i agree with all of your points, i'm trying to highlight the mismatch between the narrative and actual use case. the collective compliance of our community in pushing this narrative
robby_w_g@reddit
I don't trust anything the tech community hypes up after the blockchain fiasco. As long as there's potential money behind it, this community will push inefficient, useless technology with an almost religious intensity of faith.
LLMs may prove to be useful long term, but I'm going to trust my own eyes and experience rather than listening to con artists again.
syklemil@reddit
Yeah, I'm also kind of trying to push out that narrative and repeat some notions that should hopefully let people have a less wild idea of where LLMs are taking us.
Also I try to be somewhat consistent in referring to the technologies we're talking about as LLMs. It is kinda in line with the history of AI research that once they achieve something, it stops being considered "AI" and gets a separate name, like voice recognition and the like. Ultimately AI is a bit too much of a sci-fi concept for good use in conversation about real technologies.
messick@reddit
Anyone who has been in this business for so short a time that they think this what a job market slowdown looks like is not actually experienced enough to be a part of this sub.
noonemustknowmysecre@reddit
Yep. But there's some reasonable fear there. It's not nearly the hollywood level of bullshit that some people have bought into. But things like this are a legitimate reason not to wholly trust their output.
. . . That part isn't hype at all. They reason and think and can extrapolate data. That's "creativity" when we like it and "hallucinations" when we don't.
Counterpoint
The evo videos are a real step up.
But looking at benchmarks... Yeah, I think you're just plain wrong on this one. Why do you think they aren't improving?
Right. We are engineers. We engineer these things. It used to be that only we could work on this stuff because to do anything with it needed in-depth understanding of how to program. If that's removed, the boss's dream is to fire us and hire a bunch of street urchins for pennies that can drive the cheap machines that do the heavy lifting. Industrial revolution, autolooms, luddites, we've been here before.
People forget that there is real skill behind all that engineering work. But if a street urchin can follow the DO-178 process (as instructed to him by a DO-178-Bot-o-tron), and have the actual thing, the documentation for the thing, and the test code verifying the thing does what the documentation says... I mean, that'll probably get close. If all this process can get Jr. Devs in line, it'll help the urchins too.
And all that said, where there's real critical code where millions are on the line or it's life-critical code, they're still going to get skilled professionals that don't randomly invent case-law or imaginary libraries.
ooooooh yeah. Boss says to try to get AI to generate unit tests? Suuuuure thing boss. It pays the same either way, even if I have to spend twice the time debugging just whatever the fuck this thing was thinking.
daishi55@reddit
Haven’t noticed this. Any sources?
dodgerblue-005A9C@reddit (OP)
https://builtin.com/articles/ai-hallucinations-worsening-solutions
https://www.reddit.com/r/ArtificialInteligence/comments/1de5ug4/google_study_says_finetuning_an_llm_linearly/
https://timesofindia.indiatimes.com/technology/tech-news/ai-models-like-chatgpt-and-deepseek-frequently-exaggerate-scientific-findings-study-reveals/articleshow/121189880.cms
daishi55@reddit
The first is an ad. The other two aren't claiming what you've said.
These tools work exceptionally well for generating good code.
djnattyp@reddit
daishi55@reddit
No, they don’t copy anything. The strings they generate are not stored anywhere in the models
Khandakerex@reddit
Yeah, this dude doesn't have a clue what he is talking about. He's as bad as the pro-AI circle jerk.
liqui_date_me@reddit
As always, there’s a bit of truth to everything.
AI hasn’t taken over my job, but it’s made a few things ridiculously more efficient. Some tasks that used to take me 30 minutes of boilerplate code now take 2 seconds, but those are 1-5% of all of my tasks. I expect that the fraction of tasks to go up, however.
AchillesDev@reddit
I do, and also do a lot of genAI work (but on the applied side). The real thing here is terrible reporting. These stories are all from non-production stress tests and the like where there are no limits/guardrails. These situations don't (and can't) happen in normal production.
DeterminedQuokka@reddit
Honestly, I feel like both extremes are wrong. AI is more useful than the people who yell about how bad it is say, and it's less useful than what the people saying they're all smarter than us and going to leave us behind claim.
Like is it interesting that AI can find a zero day, yeah sure. But if you have to generate 100 reports and one contains a real zero day and 99 are trash it didn’t actually save you any time. But it might have just found a super significant bug. Let’s all just agree to be realistic.
Fancy-Tourist-8137@reddit
/r/USDefaultism
Cultural_Ebb4794@reddit
The US is the beating heart of the software industry. It literally is the default in this case.
Fancy-Tourist-8137@reddit
Lmao. Sure thing bro.
Available-Table2446@reddit
I think the very possibility of AI being used to replace devs and satisfy vibe coding CEOs is the problem.
I blame the government for not regulating this shit.
If you need AI to not pay fair wages or suppress wages and replace tax paying citizens, F u, you're not allowed to use it. If you do go ahead with it, we should be taxing the living hell outta them. For every developer you fire in the name of AI, you will pay 2x the tax.
If you are going to be using AI to assist with healthcare, life-saving diagnostics based on data, yeah, go ahead and keep selling, keep innovating. If you're going to use AI to provide scientists with the information to solve earth's problems, yeah, go ahead.
But hey, what do I know. I pay taxes and I hope I don't get laid off on account of these vibe coders.
zombie_girraffe@reddit
That's called a stuck process, kill -9 solves that problem.
lab-gone-wrong@reddit
It wasn't even that. Glancing at the logs quickly revealed that it generated the wrong path to the shutdown script and therefore couldn't find it.
Too dumb to shut itself off. And the people reading the news headline are trained by society not to check the primary source so they accept the false story over the publicly available reality.
The intelligence of the reader who doesn't follow through is as artificial as the LLM's.
stevefuzz@reddit
I work for a company that has become very successful at finding ways to actually use bandwagon tech. Big data, AI (nn and ml), and now LLMs. I think that LLMs are amazing and have some incredibly useful and interesting applications. I just don't see development as the be all end all use case. I find it silly that LLMs can be applied to so many interesting things, yet, the push is to just replace devs and doctors.
CoochieCoochieKu@reddit
What you think is irrelevant, tell me what your CEO thinks
kyriosity-at-github@reddit
1) CEOs aren't to think - they are to speak
2) "Money, money, money" (c) Abba
stevefuzz@reddit
Lol seems like a better application for LLMs than development.
loxagos_snake@reddit
My CEO is a business school kind of guy. He's great for steering the company and convincing people to do stuff. He also looks and sounds good on camera.
I wouldn't trust his opinion on AI for the same reason I wouldn't trust him to fix my tap.
DonaldStuck@reddit
CEOs are way too often seen as oracles, but I've heard they're actually humans like you and me.
dodgerblue-005A9C@reddit (OP)
i agree, we've an economy which is biased towards capital first and labour last.
but i'm trying to gauge our community's collective sentiment
dodgerblue-005A9C@reddit (OP)
luckily, the kool-aid is not pushed so much at my place!
Reddit_is_fascist69@reddit
Gov job isn't going to let us connect to AI outside their network, so I'm fine for now.
Otherwise, it feels like a glorified web browser. And I'd rather not vet it like OP mentioned when I can see the website and instantly know how trustworthy the content is.
Healthy_Albatross_73@reddit
Is that true? I don't think it's true. AI models get better training on their own data.
Pozeidan@reddit
If you're a software engineer and not a coder (or code monkey) then you're fine. AI really saves time when writing code, but for the rest it just can't do well. You still need someone to check the code and guide the AI and write the code when AI doesn't help.
It's still a major advancement and software engineers are actually AI guinea pigs in the sense that this profession integrates AI in its workflows rapidly. It allows us to really understand its limitations.
NoleMercy05@reddit
Copium
Cultural_Ebb4794@reddit
Surely you can proompt a better rebuttal than that.
DramaticCattleDog@reddit
I got rejected from a role at the final round after the panel asked me my thoughts on AI. I simply told them it's a solid tool to assist development, but "vibe coding" has many issues in my book. I received high marks on my architecture and live code rounds.
Within an hour of the interview ending: "we regret to inform you we have chosen to proceed with another candidate whose skills more closely align to the role"
Repulsive_Constant90@reddit
AI is hype. It's where the money is, and people love money. You can sell dirt to anyone if you know how to package it.
nonades@reddit
We're in for a reckoning soon about the absolutely astronomical costs for AI (both in talks of companies investments in data centers and the environmental costs of those data centers) and the perceived gains from it.
I had to tell my coworker again to stop parroting what Gemini says when he looks up stuff about Azure because it's consistently wrong
flavius-as@reddit
About point 5: it's totally worth it, and it's the only in-code activity where AI is great.
It's also aligned with how LLMs work: they're prediction machines, so... let them predict.
Sheldor5@reddit
humanity is doomed, people are getting dumber every second, the vast majority lacks intelligence to observe themselves and their change in behaviour or addictions (social media, tiktok, outsourcing thinking) ...
I just try to avoid those people to keep myself sane
csanon212@reddit
Point #1 about Section 174 is misleading. This change surprised startups in early 2023 when it took effect, right as the first wave of big tech layoffs hit. It affects companies that are not cash flow positive the most, which are startups. Since that time, VC funding and hiring levels at startups have adjusted over 2 years. Reversing this provision tomorrow would have no effect on big tech other than maybe improving the industry mood. Fortune 500 companies won't reverse massive offshoring on a whim and a vibe.
dodgerblue-005A9C@reddit (OP)
speaking of hyperbole, a lot of vcs are backed by big tech and they've been cannibalizing startups all over the place to avoid getting usurped by innovation from smaller teams while they're stuck in mediocrity and bureaucracy
Pleasant-Direction-4@reddit
I totally agree with you. I use these tools every day at work as well as in personal life. If your problem is already solved and available on the internet, it will help you find it faster; otherwise it is just a waste of time. It is also good at following patterns, so you can feed it examples of something you have already done, wait till it generates the output, and change it as per your need.
I don't offload any logical tasks to it though, as that would train my brain to be lazy; it is just a probabilistic word predictor, and most of the time it will be wrong. Plus the reliability of these tools is very low for slightly involved tasks.
The key benefit here is you can use it to quickly onboard to a new technology or help you understand a codebase faster
kyriosity-at-github@reddit
> the real reason is the tax code change in 2017
Corporate greed and management self-overestimation first.
There were no tax changes behind the outsourcing of projects to unqualified developers since 2000 (with failure as the result).
rayfrankenstein@reddit
Sometimes you just have to laugh at it.
https://youtube.com/shorts/3eQnrLJ_klQ?si=vtmzNYgxt7GMc3UT