The I in LLM stands for intelligence

[-]

philipquarles@reddit

I bet a good LLM could get this joke though.

Reply

[-]

Incidentally, when I first started playing around with chatGPT I thought that it could identify the jokes I was making, because I'd say "do you see the joke", and it would say something like "yes, it is a pun about pirates" or whatever... and that was true. But then, after digging deeper with questions like "can you tell me explicitly what the pun is", I found that it was almost always wrong. LLMs are very good at sounding convincing. They give very plausible answers, including to questions like "what was this joke about" - but as with everything they say, the answers are just statistically good guesses.

Reply

[-]

LookIPickedAUsername@reddit

Were you using 3.5 or 4? 3.5 really sucked at understanding jokes. In my experience it would confidently explain things in a completely incorrect fashion, and no matter how many hints I would give it ("Actually, the joke relies on the fact that 'who' and 'hoo' are homophones"), it would just say still-incorrect things like "Oh, I apologize for my earlier misunderstanding. The joke relies on the fact that 'hoo' is pronounced the same as 'owl', making this a humorous pun about owls.". It clearly never actually got it. I just tried a few puns in 4 and it nailed them. Here's its answer to "What do you get if you cross an elephant and a rhino?" (with the context that it knew I wanted it to explain why the jokes were funny): 'The answer to this joke is typically "elephino," which sounds like "hell if I know." It's a humorous blending of the two animal names to sound like a common phrase of confusion or lack of knowledge.'

Reply

[-]

BigHandLittleSlap@reddit

> Why is everyone downvoting this…?

I've noticed that there are a lot of luddites out there feeling vulnerable about their future employment prospects. Any narrative that reinforces their world view of "Hurr-durr, AI is stupid!" is voted up. Any contrary opinion, whether factual or not, is voted down. Because we all know that voting down the comment outlining the problem makes the problem go away, right?

GPT4 can solve just about every "ChatGPT can't solve" problem, and is already significantly out-of-date. It was developed in 2021! GPT5 is coming this year, and God only knows what will happen in the next few years...

Reply

[-]

blind3rdeye@reddit

Downvotes are often about how people feel about what is said rather than about the meaning & quality of what is actually said. So in this case, I'd guess that you're getting downvoted because it sounds like you're defending chatGPT, regardless of whether that's true or whether what you're saying has merit.

To answer your question, I was using 3.5. So probably it has improved. I'd still expect it to have a pretty good answer for 'common' jokes, and a relatively poor answer for jokes that I've invented myself; but I don't know. I don't have easy access to 4.0 to test it.

To be honest, it's almost unfair to expect the AI to understand jokes based on similar sounds or similar spelling, because the AI can't see those things. It doesn't have access to how words sound; and surprisingly it also can't directly see the spelling either. The text you give the AI doesn't go directly into its neural net, but rather it is first turned into 'tokens', which are nothing like the letters or symbols that you use. It also answers using tokens, which are then translated back into letters for you to read. So the AI never sees the spelling of words. It basically just has to 'remember' what someone told it the spelling was; and that would make these jokes a lot harder to understand. And for sounds, obviously it's even harder.

So yeah - I don't really expect that it will be doing a great job with my jokes any time soon. The main point of my previous post wasn't so much that it isn't great with jokes, but rather, it can often seem to understand things that it really does not understand at all.
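To make the tokenization point concrete, here is a minimal sketch of a greedy longest-match tokenizer. The vocabulary and the ID numbers are invented for illustration; real tokenizers (byte-pair encoding and similar) use vocabularies of many thousands of entries, so treat this as a toy, not as how any particular model works.

```python
# Toy illustration only: TOY_VOCAB and its IDs are hypothetical, not taken
# from any real model's tokenizer.
TOY_VOCAB = {
    "who": 101,
    "hoo": 102,
    "ele": 103,
    "phant": 104,
    "ph": 105,
    "ino": 106,
    " ": 107,
}


def encode(text: str) -> list[int]:
    """Split text into token IDs, always taking the longest matching piece."""
    ids = []
    pos = 0
    while pos < len(text):
        # Try the longest candidate first, then progressively shorter ones.
        for end in range(len(text), pos, -1):
            piece = text[pos:end]
            if piece in TOY_VOCAB:
                ids.append(TOY_VOCAB[piece])
                pos = end
                break
        else:
            raise ValueError(f"no token covers {text[pos:]!r}")
    return ids


def decode(ids: list[int]) -> str:
    """Map token IDs back to text for display."""
    inverse = {token_id: piece for piece, token_id in TOY_VOCAB.items()}
    return "".join(inverse[token_id] for token_id in ids)


print(encode("who"))       # one ID
print(encode("hoo"))       # a different, unrelated ID
print(encode("elephino"))  # three IDs, none of which is a letter
```

In this toy, "who" and "hoo" become two unrelated integers. Nothing in the numbers themselves says that the two words sound alike or share letters; any such knowledge would have to be learned from the training text.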

Reply

[-]

striata@reddit

This type of AI-generated junk is a DoS attack against humanity. Bug bounty reports, Stackoverflow answers, or nonsense articles about whatever subject you're searching for. They're all full of hallucinations. It'll take longer for the reader to realize it's nonsense than it took to generate and publish the content.

Reply

[-]

eigenman@reddit

For programming and math, it wastes so much time because at first glance it looks kinda ok. Then you work it out and it's wrong 50% of the time. Way better tools out there for this than LLM.

Reply

[-]

SanityInAnarchy@reddit

Github Copilot is decent. No idea if LLM plays a part there. It *can* be quite wrong, especially if it's generating large chunks. But if it's inserting something small and there's enough surrounding type information, it's a lot easier to spot the stupidity, and there's a lot less of it.

Reply

[-]

SuitableDragonfly@reddit

Github Copilot reproduces licensed code without notifying the user that they need to include a license.

Reply

[-]

alluran@reddit

Prove it. Microsoft has a multi-billion-dollar guarantee behind it saying that it doesn't. Or a reddit user with 3 karma. I know which one I'm believing.

Reply

[-]

SuitableDragonfly@reddit

Someone actually showed it doing this in a demonstration. I don't know what other proof you need.

Reply

[-]

alluran@reddit

I can turn the guardrails off and ask it to reproduce copyrighted code too. I can't teach you to read though.

Reply

[-]

SuitableDragonfly@reddit

There's nothing Microsoft can do to prove that it won't reproduce copyrighted code, in any mode. The whole point is that the output is nondeterministic, so they can't guarantee anything about it. It doesn't matter what they say about it, they can't change that fact.

Reply

[-]

alluran@reddit

Like I said - you can lead a horse to water, but you can't teach it to read.

Reply

[-]

SuitableDragonfly@reddit

No written text can change the fact of what Copilot is. I have no interest in reading whatever bullshit Microsoft made up about it.

Reply

[-]

alluran@reddit

It's ok - you don't have to understand technology - you can complain about it just like all the politicians that brought us wonders like the war on net neutrality, and remote id

Reply

[-]

SuitableDragonfly@reddit

I have a master's degree in Computational Linguistics, which includes technology like Copilot. I probably know more about how it works than you do.

Reply

[-]

alluran@reddit

I literally linked the guarantee in this thread. Sounds like you went to the university of Weetabix 🤣

Reply

[-]

SuitableDragonfly@reddit

I literally read the guarantee and it literally does not guarantee anything about Copilot's output. I already explained exactly what it does guarantee, if you are too lazy to read the link that you yourself posted, you could at least read my summary.

Reply

[-]

alluran@reddit

Sorry, I admit I didn't read the full link I provided, I just googled for an article that covered the topics discussed in our call with Microsoft.

Reply

[-]

SuitableDragonfly@reddit

If "the topics discussed in our call with Microsoft" involved an actual guarantee about the content of Copilot's output, I think Microsoft lied to you, sorry to say.

Reply

[-]

alluran@reddit

Again, who am I going to believe - the Reddit "computational linguist" or contracts we received relating to the product 🤣 Sorry that I'm unwilling to share company documents online to make an otherwise simple point. If you're unable to think of any possible way for Microsoft to make said guarantee, then I question your degree, as there's some pretty basic methods that can be used, especially for a product as focused as code copilot.

Reply

[-]

SuitableDragonfly@reddit

If it isn't in writing, it's not in your contract. The actual guarantee that they are offering is the one you linked, not whatever they told you over the phone. You can choose to believe an unofficial, not legally binding thing that some sales rep told you, or you can believe your actual contract that's in black and white. Up to you, I guess.

Reply

[-]

alluran@reddit

Ok, you win, enjoy your karma, and I'll enjoy comfortably using "your" research to accelerate our teams. Next time I'm in a call with our Microsoft rep, I'll let them know they should hire you as you clearly know so much more about their products than they do. Take care!

Reply

[-]

SuitableDragonfly@reddit

All I've done is read what Microsoft actually published on their website. You could learn to do the same thing, you don't need a Microsoft rep for that.

Reply

[-]

alluran@reddit

Shhhh, you've won now. It's over.

Reply

[-]

SuitableDragonfly@reddit

I don't give a shit about winning arguments on reddit. I want people to actually learn what AI is and is not good for, rather than uncritically swallowing Microsoft's propaganda about it.

Reply

[-]

alluran@reddit

OK - you win twice now. Congratulations. I'm sure your OpenAI offer letter is in the mail.

Reply

[-]

SuitableDragonfly@reddit

You're really challenged when it comes to reading, aren't you?

Reply

[-]

alluran@reddit

Yes - I am. You win again.

Reply

[-]

SanityInAnarchy@reddit

What do you mean by "multi-billion-dollar guarantee", exactly? I mean, never mind that you're wrong and it's been caught doing exactly this, I assume Microsoft didn't actually pay out a billion-dollar warranty claim to the user who caught it "inventing" `q_rsqrt`.

Reply

[-]

alluran@reddit

> do I get to defend myself with Microsoft's lawyers?

Yes - that is literally the guarantee they provide, if you're using copilot with their guardrails. Just because the free version doesn't have enterprise features doesn't mean I'm wrong at all - just means you need to learn to read.

Reply

[-]

SanityInAnarchy@reddit

Hmm. It's a good idea, but I'm not sure how much I'd trust it:

> Require the customer to use the content filters and other safety systems built into the product and the **customer must not attempt to generate infringing materials,** including not providing input to a Copilot service that the customer does not have appropriate rights to use.

Seems reasonable, but when those Microsoft lawyers turn on you, how sure are you that you can prove nothing you did was *attempting* to generate something infringing?

Nobody said anything about enterprise features. I guess it didn't occur to anyone that they might paywall this. No, the concern was that Copilot has already been demonstrated to produce copyrighted code. I'm glad Microsoft has faith in the guardrails they've added *since* then, but that doesn't make the concern invalid.

Reply

[-]

alluran@reddit

All prompts, settings etc are going to be logged by Microsoft. The way the functionality works it really isn't going to "accidentally" infringe, because it's comparing output and refusing to return it if they find it verbatim in their training material. Your point is valid, but doesn't align with the product they're selling enterprise customers. These aren't enthusiastic hobbyists they're selling to, but big hitters that are very risk averse, and their sales pitch goes into it in great detail.

Reply

[-]

SanityInAnarchy@reddit

> it's comparing output and refusing to return it if they find it verbatim in their training material.

It's still possible to infringe if you find something substantially similar, even if it isn't verbatim. If it's only checking for verbatim results, it's possible to miss stuff.

> Your point is valid, but doesn't align with the product they're selling enterprise customers.

I know at least one enterprise customer doesn't rely on this at all, and only allowed it after setting up a third-party system to scan each PR for possible infringement.

Personally, there's a reason I only use this at work: At the end of the day, if it results in significant damage to the company, well, the company approved it, and I'm following company policy, so I've got no personal liability. But for anything I own, it'd be a bit of a more-practical Pascal's Mugging -- probably nothing happens, and if something does MS probably has my back, and if they don't I am *ruined.* It'd be worth the risk for something revolutionary, but it hasn't been that for me.

> ...big hitters that are very risk averse...

I'd hope so, but these hype cycles seem to be able to punch through a *lot* of that. I've seen decision-makers be reluctant about allowing humans to write basic automation, and yet these same people suddenly lose their minds over plugging in AI to do the same thing, as if an LLM is *less* likely to make a mistake than a Python script.
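The verbatim-versus-similar distinction can be sketched in a few lines. This is not how Copilot's filter is implemented (that isn't public in this thread); `KNOWN_SNIPPETS`, the function names, and the sample snippet are all hypothetical. It only shows why an exact-match check lets a trivially renamed copy through while a similarity score still flags it.

```python
import difflib

# Hypothetical stand-in for an index of known public code.
KNOWN_SNIPPETS = {
    "def is_even(n):\n    return n % 2 == 0\n",
}


def blocked_by_verbatim_filter(suggestion: str) -> bool:
    """Exact-match check: only catches character-for-character copies."""
    return suggestion in KNOWN_SNIPPETS


def max_similarity(suggestion: str) -> float:
    """Highest difflib similarity ratio (0.0 to 1.0) against any known snippet."""
    return max(
        difflib.SequenceMatcher(None, suggestion, known).ratio()
        for known in KNOWN_SNIPPETS
    )


verbatim = "def is_even(n):\n    return n % 2 == 0\n"
renamed = "def is_even(x):\n    return x % 2 == 0\n"  # same code, one variable renamed

print(blocked_by_verbatim_filter(verbatim))  # True
print(blocked_by_verbatim_filter(renamed))   # False: slips past the exact check
print(max_similarity(renamed) > 0.9)         # True: still nearly identical
```

Whether a near-identical snippet is actually infringing is a legal question, not something a similarity score can settle; the sketch only shows that "not verbatim" and "not similar" are different tests.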

Reply

[-]

alluran@reddit

Right, but remember this guarantee cuts both ways. If Netflix uses this, and finds something OBVIOUSLY infringing, I'm sure they've got some lawyers that would love to see how deep Microsoft's pockets are...

Reply

[-]

psychob@reddit

Didn't [Copilot reproduce the famous inverse square root algorithm](https://twitter.com/StefanKarpinski/status/1410971061181681674) from Quake? And then they [just banned `q_rsqrt`](https://twitter.com/moyix/status/1433261377125326851) so it wouldn't output that code? I guess it's good that you believe it, because it requires a certain amount of faith to trust the output of any LLM.

Reply

[-]

svick@reddit

Copilot now has a setting to forbid "Suggestions matching public code", so I don't think tweets from 2021 prove anything.

Reply

[-]

alluran@reddit

You'll never convince the doomers who are too busy shouting down anything related to AI to actually learn to read.

Reply

[-]

alluran@reddit

Didn't that one guy trying to invent parachutes kill himself jumping off the Eiffel Tower? Glad *you* believe in parachutes - takes a certain amount of faith! Or am I just being stupid by comparing things from decades ago to newly released products, contracts, and terms of service? I'll let you decide.

Reply

[-]

carrottread@reddit

This is a bad example of such licensed code reproduction. This function wasn't created by someone at id Software, but was just copy-pasted from some other source (https://www.beyond3d.com/content/articles/8/ and https://www.beyond3d.com/content/articles/15/). So, while the whole Quake 3 source code is under the GPL, this function by itself isn't. Because of that, this function was copied by thousands, and that led to Copilot suggesting it. And it looks like most (all?) examples of "copilot reproduces licensed code" turn out to be not very sound, just like the claims of 'stealing' an implementation of the isEven function as `return n%2 == 0` from some book.

Reply

[-]

cinyar@reddit

> Microsoft has a multi-billion-dollar guarantee

As in Microsoft will pay me a billion dollars if I get into legal trouble because of copilot code?

Reply

[-]

alluran@reddit

They will fight your legal battle for you.

Reply

[-]

SanityInAnarchy@reddit

Yes, it does badly if, say, you open a new text file, type the name of something you want it to write, and let it *write it for you.* It's a good reminder not to blindly trust the output, and it's why I'm most likely to ignore any suggestion it makes that's more than 2-3 lines.

What Copilot is good at is stuff like:

    DoSomething(foo=thingX, bar=doBar(),

There are only so many things for you to fill in there, particularly with stuff that's in-scope, the right type, and a similar name. (Or, if it's *almost* the right type and there's an obvious way for it to extract that.)

At a certain point, it's just making boilerplate slightly more bearable by writing *exactly* what I'd type, just saving me some keystrokes and maybe some documentation lookups.

Reply

[-]

SuitableDragonfly@reddit

It sounds like you're just using Copilot as a replacement for your IDE? Autocompleting the names of variables and functions based on types is a solved problem that doesn't require AI, and is much better done without it.

Reply

[-]

LawfulMuffin@reddit

It’s autocomplete on steroids. It’ll often recommend that code block or more just by naming the function/method something even remotely descriptive. If you add a comment to document what the functionality would be, it gets basic stuff right almost all the time. It’s not going to replace engineers probably ever, but it’s also not basic IDE functionality.

Reply

[-]

WhyIsSocialMedia@reddit

> It’s not going to replace engineers probably ever

I'm amazed how little people even here understand about these networks. These language models are absolutely absurdly powerful and have come amazingly far in the past several years. They are truly the first real general AI we have. They can learn without being restrained, they can be retasked on narrow problems from moving robots or simulated environments all the way to generating images etc. They have neurons deep in the network that directly represent high level human concepts. The feeling among many researchers at the moment is that these are going to turn into the first true high level intelligence.

The real problem with them at the moment is they have very poor to no meta level training. They simply don't care about representing truth a lot of the time at the moment. Instead they just value whatever we value. This is why something like ChatGPT is so poor, they are aiming for everything, the researchers need to be able to pick good examples for any subject. No one can possibly do that.

If we can figure out this meta learning in the next few years, there's a serious chance we will have a true post-human level intelligence in the next decade. It's frankly absolutely astonishing how far these networks are coming. They're literally already doing things that many people thought wouldn't happen for decades. People are massively underestimating these networks.

Reply

[-]

Full-Spectral@reddit

You are really projecting. So many people just assume that the mechanisms that have allowed this move up to another plateau are the solution and it's all just a matter of scaling that up. But it's not. It's not going to scale anywhere near real human intelligence, and even to get as close as it's going to get will require ridiculous resources, where a human mind can do the same on less power than it takes to run a light bulb and in thousands of times less space.

Reply

[-]

PopcornBag@reddit

> They are truly the first real general AI we have.

Nope. Unless you think zip files and markov chains were somehow rudimentary AI, then not even remotely close.

> The feeling among many researchers at the moment is that these are going to turn into the first true high level intelligence.

"Some ancient astronaut theorists say, 'Yes'."

> Instead they just value whatever we value.

Yeah, wonder why that is? Oh, right, because of how the entire process for "training"/encoding entails annotation and validation by humans.

> I'm amazed how little people even here understand about these networks.

At least we can agree that there's certainly an understanding issue here...

Reply

[-]

WhyIsSocialMedia@reddit

> Nope. Unless you think zip files and markov chains were somehow rudimentary AI, then not even remotely close.

Do you actually believe that these networks are actually as simple as Markov chains and zip files? They aren't remotely similar?

> "Some ancient astronaut theorists say, 'Yes'."

What a silly straw man? If you wanted to just call out a fallacy you would have been better off calling out an argument from authority. But that wasn't my argument, instead it's more that there are many arguments from them that these networks are extremely advanced but suffer heavily from a lack of direction in their meta training.

> Yeah, wonder why that is? Oh, right, because of how the entire process for "training"/encoding entails annotation and validation by humans

This is where the overwhelming majority of human intelligence comes from? It didn't come from you or me, it came from other humans? We've been working on our meta level intelligence for thousands to tens of thousands of years at this point. It takes us decades to get a single average individual up to a point where they can contribute new knowledge.

Modern ML only has a very low degree of this meta understanding. And we know that humans that grow up without it also have issues - there's a reason the scientific method etc took us so incredibly long to solidify. There's very good reasons humans have advanced and advanced over time. It's really not related to any sort of increase in average intelligence, it's down to the meta we've created. Thankfully we already have large systems set up for this.

> At least we can agree that there's certainly an understanding issue here...

You literally called the modern networks Markov chains and zip files? You have no idea what you're talking about if you literally think that's all they are.

Reply

[-]

SanityInAnarchy@reddit

The irony here is, this is *exactly* the thing I'm criticizing: If I let it autocomplete an entire function body, that's where it's likely to be the most wrong, and where I'm most likely to ignore it entirely. ...I mean, unless the body is a setter or something.

Reply

[-]

Feriluce@reddit

Have you used Co-pilot at all? It kinda sounds like you haven't, because this isn't a real problem. You know what you want to do, and you can read over the suggestion in 5 seconds and decide if it's correct or not. Obviously you can't (usually) just give it a class name and hope it figures it out without even checking the output, but that doesn't mean it's not very useful in what it does.

Reply

[-]

WhyIsSocialMedia@reddit

Yeah these people seem like they will never be impressed. Of course you can't give any model (biological or machine) an ambiguous input and expect it to do better than a guess. How far these models have come in the last several years is frankly fucking absurd. There's so many things that they can do that almost no one seriously thought we'd have in our lives. Several years ago I thought we wouldn't see a human level intelligence for at least 50+ years, but it seriously looks like we might potentially hit this in the next decade at this rate.

Reply

[-]

Full-Spectral@reddit

Not even close to the next decade. People are incorrectly seeing a step upwards (driven by the availability of what not long ago would have been considered crazily powerful hardware, and the burning of a LOT of energy) to make practical something that wasn't before. But it's not going to scale anything like human intelligence. Are we going to fill up whole midwestern states with server farms and suck up half the world's energy production? Are we going to put every single thing we do in the hands of a couple of massive companies and bet our lives on 100% uptime all the way from us to them? It'll take fundamentally different approaches to get close to real human intelligence, and to do it without sucking up all the energy we can make.

Reply

[-]

SanityInAnarchy@reddit

Yes, I have? If it's a solution that only takes five seconds to read, that's not really what I'm talking about. It does fine with tiny snippets like that, small enough I'm probably not splitting it off into a separate function anyway, where there's really only one way to implement it.

Reply

[-]

SuitableDragonfly@reddit

That's not what the person I responded to is describing. That's what they're saying is an inappropriate use of the tool because it tends to fuck it up.

Reply

[-]

SanityInAnarchy@reddit

Not a replacement, not exactly. It plugs into VSCode, and it's basically just a better autocomplete (alongside the regular autocomplete). But it's hard to get across *how* much better.

If I gave it the above example -- that's cut off deliberately, if that's the "prompt" and it needs to fill in the function -- it's not just going to look at which variables I've used most recently. It's also going to guess variables with similar names to the arguments. Or, as in the above example, a function call (which it'll *also* provide arguments for). If I realize this is getting long:

    DoSomething(foo=thingX, bar=doBar(a, b, c, d, ...

and maybe I want to split out some variables:

    DoSomething(foo=thingX, bar=barred_value

...it can autocomplete that variable name (even if it's one that doesn't exist and it hasn't seen), and then I can open a new line to add the variable and it's already suggesting the implementation.

It's also fairly good at recognizing patterns, especially in your own code -- I mean, sure, DRY, but sometimes it's not worth it:

    a_mog = transmogrify(a)
    b_mog = transmogrify(b)

I don't think I'd even get to two full examples before it's suggesting the rest. This kind of thing is *extremely* useful in tests, where we tolerate much more repetition for the sake of clarity. That's maybe the one case where I'll let it write most of a function, when it's a test function that's going to be almost identical to the last one I wrote -- it can often guess what I'm about to do *from the test name,* which means I can write `def test_foo_but_with_qux():` and it'll just write it (after already suggesting half the test name, even).

Basically, if I *almost* have what I need, it's very good at filling in the gaps. If I give it a blank slate, it's an idiot at best and a plagiarist at worst. But if it's sufficiently-constrained by the context and the type system, that really cuts down on the typical LLM problems.

Reply

[-]

SuitableDragonfly@reddit

Aside from suggesting a name for a variable that doesn't exist yet, my IDE can already do all of that stuff.

Reply

[-]

SanityInAnarchy@reddit

Your IDE can already write entire unit tests for you?

Reply

[-]

SuitableDragonfly@reddit

No, but neither can Copilot. It works the way you describe, by suggesting the right things as I type.

Reply

[-]

SanityInAnarchy@reddit

Yes, it can. That's what I was talking about here:

> That's maybe the one case where I'll let it write most of a function, when it's a test function that's going to be almost identical to the last one I wrote -- it can often guess what I'm about to do *from the test name*...

...and then it'll write the entire test case. Doesn't require much creative thought, because I already have a dozen similar test cases in this file, but this is something I hadn't seen tooling do for me before.

It's closer to traditional autocomplete than the hype would suggest, but it's better than you're giving it credit for.

Reply

[-]

SuitableDragonfly@reddit

With autocomplete, adding a test case takes like five seconds. How much longer does it take you that you need copilot to do it for you? Also, if I had copilot generate the test case it would take longer, since it takes longer to verify that code is correct than it does to just write it in the first place for code like that.

Reply

[-]

SanityInAnarchy@reddit

Wait, didn't you just say this was something an IDE *can't* do for you?

Reply

[-]

SuitableDragonfly@reddit

No, I never said the IDE can't do autocomplete. Of course the IDE can do autocomplete.

Reply

[-]

SanityInAnarchy@reddit

I asked if it can write an entire unit test. First, you said:

> No, but neither can Copilot. It works the way you describe, by suggesting the right things as I type.

Now, you say this is a five-second task with autocomplete, which absolutely hasn't been my experience.

Reply

[-]

SuitableDragonfly@reddit

If you write your tests right, you define each test case as a struct/object/whatever your language likes to use here in a list of test case structs, and then you set up whatever mocks and fixtures you need, and write a loop that feeds your test case structs' data through whatever function you're testing one at a time, or in parallel. To add a test case, you only need to define a new test case struct.
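For readers who haven't seen the pattern, here is a minimal sketch of a table-driven test in Python's `unittest`. The function under test (`clamp`), the `Case` dataclass, and the case data are all hypothetical, made up for this illustration; the commenter doesn't name a language or framework.

```python
import unittest
from dataclasses import dataclass


def clamp(value: int, low: int, high: int) -> int:
    """Hypothetical function under test: restrict value to [low, high]."""
    return max(low, min(value, high))


@dataclass(frozen=True)
class Case:
    """One row of the test table."""
    name: str
    value: int
    low: int
    high: int
    want: int


# Adding a test case means adding one row here; the loop below is unchanged.
CASES = [
    Case("inside range", 5, 0, 10, 5),
    Case("below range", -3, 0, 10, 0),
    Case("above range", 42, 0, 10, 10),
]


class ClampTest(unittest.TestCase):
    def test_clamp_table(self):
        for case in CASES:
            # subTest reports each failing row separately instead of
            # stopping at the first failure.
            with self.subTest(case.name):
                got = clamp(case.value, case.low, case.high)
                self.assertEqual(got, case.want)
```

The trade-off discussed in the replies shows up here too: this works well while every row fits the same shape, and gets harder to read once rows need their own mocks or special-case flags.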

Reply

[-]

SanityInAnarchy@reddit

> If you write your tests right, you define each test case as a struct/object/whatever your language likes to use here in a list of test case structs...

Test tables, sure. Whether that's preferred depends what you're writing.

Go almost demands it, because the language makes it so difficult to build other common testing tools like assertion libraries, and style guides tend to outright discourage that sort of thing. So since you can't write easily-reusable assertions or components, the only way you can avoid having each test case blow up into literally dozens of lines of *this* boilerplate:

    got := DoThing(someInputs)
    if got != want {
        t.Errorf("DoThing() = %+v, want %+v", got, want)
    }

...is to rely heavily on test tables, and then pile all the actual code into a big ugly chunk at the bottom. But this means if you have a few different tests that don't quite fit a test table, you end up either writing *way* more test code than you should have to, or you contort them into test-table form. I've noticed those structs tend to pick up more and more options to configure the test -- in pathological cases, they practically become scripting languages of their own.

Where I was impressed with Copilot was an entirely different workflow -- think more like Python's [unittest](https://docs.python.org/3/library/unittest.html), or [Pytest](https://docs.pytest.org/en/7.4.x/). You can still easily use [parameterized tests](https://pypi.org/project/parameterized/) if you really do want to run *exactly* the same test a few different ways, kind of like you'd do for a Go test table. But more often, these encourage pushing the truly-repetitive stuff to either fixtures or assertion libraries, and still defining a test as an "arrange, act, assert" block.

Something like:

    def test_something(self):
        # arrange some mocks
        self.enterContext(mock.patch.object(somelib, 'foo'))
        self.enterContext(mock.patch.object(somelib, 'bar'))

        result = test_the_actual_thing()  # act

        self.assertEqual(some_expected_value, result)
        somelib.foo.assert_called_with(whatever)
        somelib.bar.assert_not_called()

Which means most of the time, adding one or two more test-cases is going to mean adding a *similar* function, but not necessarily so similar that it could've just been parameterized. Or, at least, not so similar that it'd be more readable that way. But it's similar *enough* that with a file full of similar tests, Copilot is very good at suggesting a new one, especially if the function name is at all descriptive. Even in a dynamically-typed language like Python, and even if your team isn't great at adding type annotations everywhere.

Reply

[-]

SuitableDragonfly@reddit

Table tests aren't limited to Go, there's absolutely no reason you can't write them in any other language as well. Go does have some annoying test stuff and assertions are less streamlined, but that doesn't have anything to do with table tests. You can use table tests to reduce repeated code in any language. Sometimes if you find yourself writing a lot of repetitive code, the actual answer is to stop doing that, not to use an AI to write it for you, which is extremely error-prone.

Reply

[-]

SanityInAnarchy@reddit

It absolutely has something to do with table tests: In other languages, table tests are one option, and not usually anyone's first choice. In Go, they're pretty much mandatory. And I talked about this -- I get the feeling you stopped reading halfway through.

> Sometimes if you find yourself writing a lot of repetitive code, the actual answer is to stop doing that, not to use an AI to write it for you...

Sometimes. But, not all the time. Tests are the place I'm most likely to tolerate repetitive code, because it's usually more important that test code be clear and obviously-correct.

Reply

[-]

SuitableDragonfly@reddit

It's much, much easier to verify that a test is correct if it's not a bunch of repetitive code.

Reply

[-]

SanityInAnarchy@reddit

Not necessarily. We deal with repetition by adding abstractions and conditionals and other logic that can include bugs. It's a lot easier to spot a bug in the kind of test I just laid out. The advantage of a test table is you can add a test case with no code at all. The disadvantage is, if you do have to write some more code to support a new test case, you're making the actual contents of that loop more complicated and error-prone.

Reply

[-]

SuitableDragonfly@reddit

I'd rather have one loop to maintain and debug than 1000 lines of your code repeated over and over and over again with minute differences.

Reply

[-]

SanityInAnarchy@reddit

As with everything else, it's a matter of degree. A thousand lines of basically the same thing is obviously way too far. So is a thousand-line-long loop with dozens of conditions woven through to avoid having to write a new test function.

Reply

[-]

SuitableDragonfly@reddit

I haven't seen that happen in our codebase (except I guess in database tests, but those are necessarily nightmarish and there's no way to fix that), but it goes without saying that you should have one test function for each piece of code you're testing, and if you need a ton of conditions to test a single piece of code that starts to suggest that maybe that piece of code is doing too many different things and should be broken up.

Reply

[-]

svick@reddit

Except Copilot does not just autocomplete a single function or variable name, it writes at least a line of code, often more.

Reply

[-]

SuitableDragonfly@reddit

The person I'm talking to does not use copilot for this purpose, because they understand that it's complete shit at that.

Reply

[-]

Gearwatcher@reddit

> and is much better done without it. Tell me you haven't remotely used Copilot for this without telling me

Reply

[-]

SuitableDragonfly@reddit

It's not a matter of having used it or not. If you have a task where the input precisely determines what the output should be, that's a deterministic task that needs a deterministic algorithm, not an algorithm whose main strength is that it can be "creative" and do things that are unexpected or unanticipated.

Reply

[-]

Gearwatcher@reddit

The assumption that a task has precisely determined input and output in this case is the point where you are so wrong that it's inevitable you'll draw the wrong conclusion.

Reply

[-]

SuitableDragonfly@reddit

It depends on the use case. Some use cases call for stochastic algorithms, some call for deterministic ones. Generally the tradeoff is that deterministic algorithms will always be correct, and always be consistent, but are easily foiled by bad input, whereas stochastic algorithms will always give an answer regardless of input quality but it is not guaranteed to be correct. > previously deterministic AI met with combinatorial explosion of complexity that made it completely unviable. Sure, if you're talking about a chess algorithm. There are plenty of other use-cases where deterministic algorithms are perfectly fine and are in fact the better option. Including code generation.

Reply

[-]

Gearwatcher@reddit

The point I'm making is that fully deterministic is basically the same as overfitted. Code generation is very easily both things, as proven by Copilot. There is a huge problem space where "mostly correct but needs a bit of massage" for thousands of use cases is preferred to "completely correct for a small subset, and wildly incorrect for everything else".

Reply

[-]

SuitableDragonfly@reddit

Not really. Overfitting is when you train on dataset A, and that dataset is insufficiently general such that the system has very different performance on stuff that is similar to A versus stuff that isn't. This concept just isn't relevant to deterministic algorithms. Deterministic algorithms by definition always output the same thing for the same input, and the output is either correct or incorrect, it's not "better" or "worse". If you've made it correctly, it's not incorrect about anything, it just sometimes doesn't have an answer due to lack of appropriate input. When it comes to coding, we're already very used to the compiler saying sorry, you made a single typo so I can't produce any output for you. Compilers, along with basically every other tool we use for dealing with code, are deterministic algorithms, because handling code is a perfect use case for deterministic algorithms.

Reply

[-]

Gearwatcher@reddit

I find your lack of imagination disturbing. A "SYNTAX ERROR" is "wildly incorrect" output still. A function is just a map between two value spaces.

Reply

[-]

SuitableDragonfly@reddit

No, a syntax error is the correct output for incorrect syntax. Are you saying you'd prefer an unreliable compiler over one that requires correct syntax?

Reply

[-]

Gearwatcher@reddit

You are the one who introduced compilers into this, because you don't have valid arguments. Let's reiterate what this was about all the time: > There is a huge number of problem spaces where "mostly correct but needs a bit of massage" for thousands of use cases is preferred to "completely correct for a small subset, and wildly incorrect for everything else".

Reply

[-]

SuitableDragonfly@reddit

Yes, there are use cases for stochastic algorithms. And there are also use cases for deterministic algorithms, such as basically anything to do with working with code. You can use whatever tools you want, dude, I never said otherwise. It's totally up to you if you want to use the wrong tools for the job. I introduced compilers because they're a perfect example of why you don't want to use stochastic algorithms for working with code. If you are so butthurt by the compiler telling you you have a typo that you'd rather it give you wrong output, you really shouldn't be working in this field. Please go get a job doing something you actually like, you'll probably be much better at it.

Reply

[-]

QuickQuirk@reddit

I think you should try it. I was sceptical too, then I tried it, and it's surprisingly good. It's not replacing me, but it's making me faster, especially when dealing with libraries or languages I'm not familiar with.

Reply

[-]

Gearwatcher@reddit

If you write a comment and expect it to output a function then yes, it's a shitshow and you're likely to get someone else's code there. But if you use it as Intellisense Plus it does an orders-of-magnitude better job than any IDE does. Another great thing it does is generate unit tests. Sure, it can botch them but you really just need to tweak them a little, and it gets all the circuit-breaker points in the unit right and all the scenarios right which is the boring and time consuming part of writing tests for me.

Reply

[-]

drekmonger@reddit

Github Copilot is powered by a GPT model that's finetuned for coding. Most recent version should be GPT-4.

Reply

[-]

thelonesomeguy@reddit

> most recent version should be GPT 4 Does that mean it supports image inputs as well now? Or still just text? (In the chat, I mean)

Reply

[-]

WhyIsSocialMedia@reddit

That would depend on exactly what they did to optimise it. But yes the model can do that. This is really one of the reasons so many researchers are calling these AI. They don't need specialized networks to do many many tasks. Really these networks are incredibly powerful, but the current understanding is that the problems with them are related to a lack of meta learning. Without this they have the ability to understand meaning, but they just optimise for whatever pleases the humans. Meaning they have no problems misrepresenting the truth or similar so long as we like that output. This is really why GitHub's optimisations work so well. Meanwhile the people who trained e.g. ChatGPT are just general researchers, who can't possibly keep up with almost every subject out there. Really we could be on the way to a true higher than human level intelligence in the next several years. These networks are still flawed, but they're absurdly advanced compared to just several years ago.

Reply

[-]

Stimunaut@reddit

> they have the ability to understand meaning No, they don't. There is 0 *understanding*, because there is no underlying awareness. Hence why they suck at inventing solutions to new problems.

Reply

[-]

WhyIsSocialMedia@reddit

> There is 0 understanding I don't see how anyone can possibly argue this anymore? They can understand and extract (or even create) meaning out of things that weren't ever in their training data? They can now learn without even changing their weights as they essentially have a form of short term memory (though far far far better than us due to how our ANNs are still based on reliable silicon). We've even made some progress on removing the black box from these networks. And what we've seen is that they have neurons that very clearly represent high level concepts in the network. These neurons are simply objectively representing meaning? To say they aren't is absurd. > because there is no underlying awareness We simply don't know this? You can't say whether a network does or doesn't have any underlying awareness. Personally I find the idea that only biological neurons have any awareness simply doesn't line up with everything we understand about physics, and also just seems arrogant. That doesn't mean these networks have as consistent or as wide an experience and awareness as us, I don't believe that (at least not at the moment). But surely you can see how believing that there's some special new property that emerges when you line up atoms in the form of biological neural networks, yet doesn't exist in any other state simply isn't supported by any science. There's simply absolutely zero emergent behaviour we've seen that isn't just a sum of its parts, so the idea it simply emerges only in these high level biological networks is absurd from that angle. That said we have virtually zero understanding of this. So I could very easily be wrong here. If I am though I think it's much more likely that it's still not emergent but instead based on something else like complexity. The alternative is the universe simply **massively** changes its behaviour/structure/complexity when it comes to this. It's also not clear that awareness has any impact on computability or determinism.
In fact given the scale and energy levels of neurons it seems pretty clear that awareness can't have any impact on what the network does. This would mean it doesn't even matter if the ANNs (or even some biological networks) are aware, they'd generate the same output no matter what. The only place we've ever seen (assuming quantum mechanics is local which isn't actually known) non-computability is at the quantum level. But even that is only random number generation, a far cry from awareness that can directly impact outcomes in a free will styled way. If it's not random then you also get serious problems with causality and the conservation of information. > Hence why they suck at inventing solutions to new problems. So do most humans? There's a reason there's such a push for meta learning in modern ML. Our success as a species (just in terms of how far we've advanced) very clearly is from our very very very advanced meta learning, which we've spent tens of thousands of years perfecting, and yet still takes decades to implement on a per human basis. The overwhelming majority of our advancements are small and incremental, it's pretty rare you get someone like Newton or Einstein (and even then they were very clearly still based on thousands of years of previous advancements). These networks are actually well above the average human capability in terms of answering new questions when you do very good fine training of the application. The problem is if you don't do this well the networks simply don't value things like truth, working ideas/code/etc, any sort of reason or rationality, etc etc. This again isn't any different than humans, as the vast majority of people will also simply value what they were grown up with. It's literally the reason cultures vary and change so massively over time and location.
Again since our meta learning is so poor for ML (especially with things like ChatGPT that simply have to currently use general researchers for deciding what outputs to value) the models simply don't properly value what we do, they simply value whatever they think we want to hear. Finally while modern models very very clearly have a much much wider understanding than us, they definitely don't have as deep an understanding as a human who has put years into learning something specific. This does appear to be a scale + meta issue though, as the networks just aren't large enough still, especially thanks to how much wider their training data is (humans simply don't have enough time to take in this wide of an experience due to how slow biological neurons are and the limits of our perception (and just physical limits)).

Reply

[-]

Stimunaut@reddit

Lol. The funniest thing out of all of this, is seeing people who don't know anything about machine learning, or neuroscience for that matter, pretending that they do. Please go and look up the meaning of "understanding," and then we'll have a conversation. Until then, I won't waste my time attempting to convey the nuances of this topic to a layman.

Reply

[-]

WhyIsSocialMedia@reddit

So you just literally ignore all my points and instead of looking at the merit you just use an argument from authority?

Reply

[-]

Stimunaut@reddit

Essentially. Your "frankly absurd" level of ignorance and "virtually 0" experience in this area became apparent around paragraph 3, which was when I stopped reading. I work with LLMs every day, have built feed-forward/recurrent/etc. neural networks to solve various problems, and I work alongside colleagues with PhDs in machine learning. Our running joke is that we could draw a brain on a piece of paper, and it would eventually convince a substantial enough portion of the population (like yourself) that it's conscious, given enough layers of abstraction. What's ironic is that LLMs would be terrible at convincing you lot of their "understanding," if not for a couple of very neat tricks: vector embeddings and cosine indexes. The reason these networks excel at generating cogent strings of text is largely (mostly) thanks to the mathematics of semantic relationships. It's not that difficult to stitch words together (that imply an uncanny sense of meaning) when all you have to do is select from the best options presented to you. But please, enlighten me: at which part during the loop, does the current instance of the initialized weight table, responsible for choosing the next best word (given a few dozen options in the form of embeddings), develop a sense of understanding? I'm dying to know.
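
For anyone following along, the "vector embeddings and cosine" part refers to comparing word vectors by the angle between them. Below is a toy sketch, assuming made-up three-dimensional vectors; real embeddings are learned and have hundreds or thousands of dimensions, and the words and numbers here are invented for illustration only.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up toy embeddings, for illustration only.
EMBEDDINGS = {
    "owl": [0.9, 0.1, 0.0],
    "hawk": [0.8, 0.2, 0.1],
    "regex": [0.0, 0.1, 0.9],
}


def most_similar(word: str) -> str:
    """Return the other word whose embedding is closest to `word`."""
    target = EMBEDDINGS[word]
    others = (w for w in EMBEDDINGS if w != word)
    return max(others, key=lambda w: cosine_similarity(target, EMBEDDINGS[w]))
```

With these toy vectors, `most_similar("owl")` picks `"hawk"` because the two vectors point in nearly the same direction. This only illustrates the similarity measure being mentioned; it does not settle the "understanding" argument either way.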

Reply

[-]

WhyIsSocialMedia@reddit

You really expect me to respond to this when you *still* won't respond to my initial argument?

Reply

[-]

thelonesomeguy@reddit

Did you reply to the wrong comment? I’m very well aware what the GPT 4 model can do. My question simply needed a yes/no answer which your reply doesn’t give

Reply

[-]

ikeif@reddit

[Yes.](https://help.openai.com/en/articles/8400551-image-inputs-for-chatgpt-faq) But maybe not in the way you’re wanting? So it’s possible if you have a specific use case the answer may be “not in that way.” (I have not tried playing with it yet)

Reply

[-]

thelonesomeguy@reddit

I was thinking more of using flowcharts or ER diagrams for improving context for the queries

Reply

[-]

drekmonger@reddit

If you have a ChatGPT Pro account, yes, there's access to GPT-4V. I don't believe that's presently true for Github Copilot. I'm not currently subbed to it, so I can't check, but it wasn't there before and I don't recall any announcements that vision was being added. But with GPT-4V via ChatGPT, yes, you could upload a flowchart or ER diagram and ask the model to write code based on the chart. It's a crapshoot whether or not it will actually be useable code on the first draft. You have to work with the model to debug afterwards, usually.

Reply

[-]

Old_Conference686@reddit

Eh to some extent, for whatever reason the autocomplete is just botched whenever you deviate from standard lib stuff and introduce your own stuff on top of the library. I use it primarily for autocomplete purposes.

Reply

[-]

killerstorm@reddit

Copilot is 100% LLM.

Reply

[-]

Metal_LinksV2@reddit

I work in a very niche field but I tried Bard and ChatGPT a few times and even on a generic regex prompt it failed.

Reply

[-]

Atulin@reddit

"Here's a C# class, I'd like you to turn all private fields into public properties" "Here it is..." "You forgot some" "I'm sorry, here it is..." "Still missing some" "I'm sorry. Here is all fields turned into properties..." "Still not all of them" "I'm sorry, here is..." At this point I wrote 5 lines of Python that just did it all in a split second.

Reply
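
The commenter's actual script isn't shown, so the following is only a guess at the general shape of such a thing: a small regex rewrite, assuming one simple field per line with no initializer, no `static`/`readonly` modifiers, and no attributes. The `fields_to_properties` name is hypothetical.

```python
import re

# Matches simple declarations such as "private int _count;".
# Assumption: one field per line, no initializer, no extra modifiers.
FIELD = re.compile(r"^([ \t]*)private\s+([\w<>\[\],.?]+)\s+_?(\w+)\s*;", re.MULTILINE)


def fields_to_properties(source: str) -> str:
    """Rewrite each matched private field as a public auto-property."""
    def replace(match: re.Match) -> str:
        indent, type_name, name = match.groups()
        prop = name[0].upper() + name[1:]  # C# properties are PascalCase
        return f"{indent}public {type_name} {prop} {{ get; set; }}"

    return FIELD.sub(replace, source)
```

Because the rewrite is a plain text substitution, it either touches every line that matches the pattern or none of them, which is the property the chatbot exchange above was missing. Anything the pattern doesn't cover is left unchanged rather than guessed at.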

[-]

OpalescentAardvark@reddit

> even on a generic regex prompt it failed. Perfect example of using a hammer to turn a screw. These common LLMs are designed to answer a simple question: "what's the next *most likely* word to pump out?" It's not designed to "think" or solve math equations or logically reason about a problem. Regex is a logic puzzle based on certain rules. LLMs aren't designed to work out what kind of puzzle something is.

Reply

[-]

BibianaAudris@reddit

LLM works great if someone in the training data already solved the puzzle, though, which is true for common regex questions. More than that, when A had a solution for half the puzzle and B solved the other half, LLM can stitch them together and happen to produce the right answer, which is genuinely more useful than a search engine. The problem is such stitching can also produce crap, and it's hard to tell which is which.

Reply

[-]

atthereallicebear@reddit

well, they are general purpose ai's, and it's not really a problem of their architecture that stops them from doing regex. their approach is perfectly applicable if they are trained long enough and have enough computing power for billions of parameters. it's like saying "human brains evolved just to figure out what muscle movements they should make based on sensory input." Sure, that is technically true but the behavior that emerges from that question is very complex, and allows us to write regex.

Reply

[-]

Kubsoun@reddit

difference between humans and ai is that humans are actually capable of inventing stuff, small difference but might be a key to why ai sucks dick at regex and works okayish as being genz google

Reply

[-]

atthereallicebear@reddit

so you are saying ai cant invent stuff? of course it can. just ask it to invent a story, or just ask it to invent an invention. it will do it. maybe it won't be a very good invention but it still invented something.

Reply

[-]

johnphantom@reddit

Yeah LLMs are wise, not intelligent.

Reply

[-]

rommi04@reddit

No they are confident idiots

Reply

[-]

cdsmith@reddit

I think experiences can vary here. I use GPT-4 all the time for mathematics. It absolutely doesn't understand anything, but it can talk through problem solving alright, and is only occasionally wrong enough that it is more of a hindrance than a help. Do I trust anything it says? Of course not. Are most of its suggestions helpful? Definitely not. I'm definitely in "skim and see if anything sticks out as useful" mode. But I find it helpful just to have a conversation in which I can say things and get some kind of immediate feedback that structures my own thought process. It also helps with feeling better, since it doesn't take much for GPT-4 to tell you that your ideas are insightful, original, and show a deep understanding of your subject. :)

Reply

[-]

markehammons@reddit

Asking GPT what the 201st prime plus the 203rd prime is consistently gets wrong answers in my experience. That's not even hard math, just basic addition and looking up numbers in a table.
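
For reference, the calculation itself is a few lines of ordinary code. This is a plain trial-division sketch, chosen for readability rather than speed; the `nth_prime` helper is a name introduced here, not something from the thread.

```python
def nth_prime(n: int) -> int:
    """Return the n-th prime (1-indexed) by trial division."""
    if n < 1:
        raise ValueError("n must be at least 1")
    primes: list[int] = []
    candidate = 2
    while len(primes) < n:
        # A candidate is prime if no smaller prime up to its square root divides it.
        if all(candidate % p for p in primes if p * p <= candidate):
            primes.append(candidate)
        candidate += 1
    return primes[-1]


total = nth_prime(201) + nth_prime(203)
print(total)  # 1229 + 1237 = 2466
```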

Reply

[-]

cdsmith@reddit

Ah, but there's a big difference between calculation and math.

Reply

[-]

Kindred87@reddit

Recent models can perform math via Python. Example: https://chat.openai.com/share/0afc763f-6c77-4ba1-b7f6-05e4914ce24d

Reply

[-]

LittleLui@reddit

That sounds like rubber duck debugging with a talking rubber duck.

Reply

[-]

SuitableDragonfly@reddit

That's basically all a chatbot is, really, just a talking rubber duck. Takes us full circle right back to ELIZA.

Reply

[-]

LittleLui@reddit

>That's basically all a chatbot is, really, just a talking rubber duck. Takes us full circle right back to ELIZA. Tell me more about that. /s

Reply

[-]

Ok-Tie545@reddit

I'm not sure I understand you fully

Reply

[-]

Tasgall@reddit

A rubber duck that understands nothing but also has the entirety of Wikipedia and open source GitHub memorized, so it can spit out the right answer even though it doesn't really understand the question.

Reply

[-]

FloydATC@reddit

It is, but once you understand and respect this simple fact, GPT can be an immensely useful tool for figuring things out. Quite unlike its mute counterpart, it can introduce aspects of the problem that you didn't know existed. The problem is still your puzzle to solve, but now you have the missing piece.

Reply

[-]

Venthe@reddit

> it can introduce aspects of the problem that you didn't know existed. The problem is still your puzzle to solve, but now you have the missing piece. Unfortunately, it also introduces you to subtle errors you didn't know could exist. As a junior, you are far better off ignoring LLMs completely, as you _need_ to understand. As a senior, coding is only a post-factum of a design. You _need_ to understand - fully - what it spews out, or else you are in a whole other world of trouble.

Reply

[-]

LawfulMuffin@reddit

It’s pointed me to substantially better solutions in the past. It’s really good at doing x/y stuff. “Write me a function that does ABC” may yield: “sure, I can do that and also you might want to just use this off the shelf thing that does that and here’s the code for that”.

Reply

[-]

treasonousToaster180@reddit

hard agree on the time wasting. I'm working on a project using a heavily documented open standard and asked it to generate a bunch of junk messages for me to pass through just to test my ability to take in data. I looked them over and it seemed fine, but I didn't realize until after spending like 3 hours working out a datetime parser that it was using the *wrong format for date and time values.* They're similar enough it wasn't immediately evident when I looked it over but different enough I had to spend another two hours revising the regex used to validate the input. Never using that shit again.

Reply

[-]

wtallis@reddit

For programming and math, I have a sliver of hope for the long term: we can demand that machine-generated answers also be machine-*verifiable*. Automated proof checkers already exist, but are too tedious for humans to bother with in most cases. But it's quite reasonable to want an AI/LLM to emit output that can be run through such tools. For a typical StackOverflow answer, it's not worth the trouble for a human to wrap the answer in an entire program that compiles, and runs some automated tests to demonstrate its own correctness, but that's a standard that bots should aspire to.
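
A rough sketch of what "machine-verifiable" could look like for a small programming answer: the answer ships with its own checks, and a harness reports every check that fails. The `verify` helper, the `candidate_answer` function, and the cases are hypothetical stand-ins, not output from any real model or tool.

```python
from typing import Callable


def verify(func: Callable[[str], bool], cases: list[tuple[str, bool]]) -> list[str]:
    """Run func on every case and return a description of each failure."""
    failures = []
    for text, expected in cases:
        try:
            actual = func(text)
        except Exception as exc:  # a crash counts as a failed check
            failures.append(f"{text!r}: raised {exc!r}")
            continue
        if actual != expected:
            failures.append(f"{text!r}: expected {expected}, got {actual}")
    return failures


def candidate_answer(text: str) -> bool:
    """Hypothetical machine-generated answer: is the text a palindrome?"""
    cleaned = [c.lower() for c in text if c.isalnum()]
    return cleaned == cleaned[::-1]


CASES = [("racecar", True), ("A man, a plan, a canal: Panama", True), ("regex", False)]
failures = verify(candidate_answer, CASES)  # empty list means every check passed
```

Passing checks like these is weaker than a checked proof: it only shows the answer agrees with the cases someone thought to write down.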

Reply

[-]

killerstorm@reddit

> Way better tools out there for this than LLM. Such as...? LLM might not help you to prove a theorem, but it might help to translate a theorem into a formal language where it can be processed by a theorem prover software. So it's rather complementary. And Terence Tao (one of the world's best mathematicians) is rather optimistic about where it's going: "I expect, say, 2026-level AI, when used properly, will be a trustworthy co-author in mathematical research, and in many other fields as well."

Reply

[-]

my_aggr@reddit

You just run the code against your test cases. If it's wrong it fails. If it's right it passes.

Reply

[-]

danstermeister@reddit

Got any advice on how to be a better neurosurgeon?

Reply

[-]

my_aggr@reddit

Don't use a mallet.

Reply

[-]

Wang_Fister@reddit

Mighty bold of you to assume I have test cases 😤

Reply

[-]

SuitableDragonfly@reddit

It's working perfectly fine for the people using it - it generates clicks. That's all they want, they don't actually care about having comprehensible content. 20 years ago people were generating the entire contents of their website for the same purpose for pennies using Amazon Mechanical Turk, nowadays they're just using AI.

Reply

[-]

starlevel01@reddit

I've found the one situation where I can tolerate copilot is when writing out manual serialisation code; I can just start the function header for the opposite function and it'll fill it out properly. Otherwise it's useless.

Reply

[-]

NotUniqueOrSpecial@reddit

I've been reimplementing the serialization layer for a very large and *very* legacy/poorly implemented codebase and this has been my takeaway as well. I can trivially slice/dice the appropriate (and prolific) hard-coded magic strings out of the existing code and create corresponding helper structs/mapping functions using multi-cursor editing and a bit of finesse. But at the end of the day, I still need to put down the final switch statement for the 20-50 members of each type to actually map that data. Copilot's done a really decent job of turning my first few lines of input into a complete mapping for the most part. I still have to check the results (especially because it sometimes makes reasonable but incorrect choices about which members to map to), but even so, it's saved me hours over the last few days.

Reply

[-]

Piisthree@reddit

Automated tools generating manual work. Kind of our worst nightmare.

Reply

[-]

covfefe-boy@reddit

I'm a programmer and I've been working with a new piece of software lately. And I of course google for answers on how to do things in this new framework. I kept coming to the same site, it's almost always at the top of the google results. And while at a glance it looks right, it was always wrong. Always, in the step-by-step directions I was wondering if I had an older version of the software or something. And there's just this huge article of text after the how-to step-by-step guide that always felt eerily off to me. I mentioned it in our slack chat to the other devs and one said he's seen similar things (on other tech) and it's usually an AI generated article base. I looked back at the site, and sure enough there was a subtle header saying this is all generated by AI and not necessarily accurate. AI is great, I love it, I work with it, but it's not quite at the replacing people stage yet. At least not all people. It might never get there. Frankly I believe if we ever let it talk to the customer it'd come running back to us programmers in tears, so I've got no worries I'll ever be out of a job.

Reply

[-]

jimmux@reddit

I learned how pervasive AI content is when I went looking for medical advice. Last month I had a stitched up wound that wouldn't stay closed, so I was trying to find info on how best to clean and bandage it. High in the results were sites with domain names like "stitchclean.com", and such. Bizarrely specific. The content was paragraph after paragraph of internally inconsistent advice, punctuated with ads. I pretty much gave up and followed my instincts with a little empirical experimentation. It worked out eventually, but I hate to think what people with more serious and urgent medical needs are doing to themselves, with full confidence because a site like "diabetesdiet.com" must be the best resource, right?

Reply

[-]

teslas_love_pigeon@reddit

Odd Q but have you thought about checking out, or buying, some medical texts? A good physical field medicine book is worth owning. The US Army has one that is affordable, there are other decent texts as well. I feel like human curation of content is going to be worth more in the future.

Reply

[-]

jimmux@reddit

I spent the last several years downsizing, getting rid of the books I carried around for years. Now I'm realising how valuable they were. Wish I knew where my SAS Survival Handbooks ended up.

Reply

[-]

RabbitNET@reddit

Be wary though - Plenty of books are full of AI garbage these days, too. Self-publishing on Amazon is being hit by it pretty hard.

Reply

[-]

teslas_love_pigeon@reddit

oh for sure, I guess I should have also mentioned be sure to check out older books. The field medicine book I have is from 2015 and it's pretty handy to help teach children/teenagers how to dress or clean a wound or make a splint (and why you'd need a splint). I do worry that the next few years are going to be extremely rough from a search and learning perspective. If you don't know how to differentiate bad material from good, it's going to be very hard for you to excel. Someone else mentioned the LLM spew will be the DDOS attack on knowledge sharing.

Reply

[-]

lowPolyReasonings@reddit

technically this is a google problem. They promote shovelware with their crap engine.

Reply

[-]

TarMil@reddit

It's both really. Shovelware generation sucks, and Google sucks for promoting it.

Reply

[-]

lowPolyReasonings@reddit

It's 100% Google. They created the internet we have today with their biased relevance algorithm. It's utterly unusable. I long for an internet without the censorship and force feeding of the abysmal ideologies of the tech giants. We live clutching to our devices in this echo chamber of a world where not quality matters but quantity and minorities and screamers have the last say in every matter. It has completely blunted our wits and we are slowly decaying into a world ruled by stupidity and loud gestures.

Reply

[-]

teslas_love_pigeon@reddit

I wish I could find the hn comment, but there was an ex-Google worker that mentioned sometime in 2015/2016 Google search relied less on AI/ML but when one particular president left they went all in on AI based search. They blamed Google's poor search results on this decision. I don't know the exact timelines for myself on noticing the enshittification, but it really feels like the last 4 or 6 years have had garbage search results. I remember searching programming problems and being directed to individual blogs on solving the issue; now it's like 85% SO clones that just scrape results and seem to be made solely for ad revenue. As a result I don't Google search anymore, I go directly to the documentation and read more tech books. Which I suppose is better, but I feel bad for others that don't have the same time.

Reply

[-]

Coffee_Crisis@reddit

google is less and less useful every day, it's really upsetting

Reply

[-]

Sigmatics@reddit

Now imagine future generations of LLMs being trained on LLM answers on StackOverflow. We have come full circle

Reply

[-]

DirectorBusiness5512@reddit

AI-generated junk is the information equivalent of nuclear fallout

Reply

[-]

takanuva@reddit

I'm gonna start using the expression "a DOS attack against humanity" from now on, if you don't mind.

Reply

[-]

imthebear11@reddit

The worst is when someone is asking something on Reddit and some absolute genius responds with, "According to ChatGPT, ...."

Reply

[-]

elsewen@reddit

No. The worst is when they just post the hallucinated crap without saying that. If they lead with "according to ChatGPT", it's fine because you can effortlessly ignore whatever comes after

Reply

[-]

dimbasaho@reddit

> The worst is when they just post the hallucinated crap *without* saying that. Like a certain serial poster in this sub who attaches ChatGPT summaries to every post and stopped disclosing that it was LLM garbage.

Reply

[-]

imthebear11@reddit

Good point lmao. At least they call out when they're being a useless idiot

Reply

[-]

Behrooz0@reddit

The worst part is I once got like -78 votes because I claimed to be a domain expert and that the ChatGPT answer was wrong, and gave examples. There were many many kids claiming I'm an old geezer trying to stop the advancement of AI because I feel threatened.

Reply

[-]

Thatdudewhoisstupid@reddit

Oh my god, r/singularity has been popping up on my feed lately and it's populated by those exact same kids. It feels like I live in a different world from the AI crowd.

Reply

[-]

Behrooz0@reddit

That's an easy fix. get yourself banned with a bang:)

Reply

[-]

Coffee_Crisis@reddit

reddit is increasingly full of idiots who downvote anything that makes them sad regardless of whether it's helpful or true

Reply

[-]

Venthe@reddit

I'm actually glad. Because at some point, the hammer of reality will drop, and it will drop _hard_

Reply

[-]

oalbrecht@reddit

According to ChatGPT, I should respond to your comment like this: *You can respond with humor, saying something like, "Well, blame it on ChatGPT – it's just trying to be the wise sage of Reddit!" Or, you could clarify that while ChatGPT can provide information, it's always good to cross-check with other sources for accuracy.*

Reply

[-]

Paulus_cz@reddit

I frequent a certain programming Discord channel which has a help section. Whenever you post a question, it creates a post and passes it to ChatGPT to attempt an answer, which gets dropped into the post. There are a lot of certified fresh programmers there, so some questions are really basic and easily answered by ChatGPT, freeing senior programmers to answer the actually meaty ones. I think that is the best use of it I have seen yet: useful, but supervised so it does not spew bullshit on people who do not know better.

Reply

[-]

MohKohn@reddit

The labeled ones are worth a good laugh usually.

Reply

[-]

GrinningPariah@reddit

I'm increasingly convinced the only important, helpful, and ethical use of LLMs will be to detect content made by LLMs so humans don't have to see it.

Reply

[-]

SettingFunny9029@reddit

iirc curl author used to have his "pronouns" in his profile, so just that such person is generally unreliable.

Reply

[-]

panenw@reddit

it will get worse before it won't get better

Reply

[-]

RedPandaDan@reddit

I worked for 5 years in an insurance call center. Most people believe call centers are designed to deliberately waste your time so you just hang up and don't bother the company; there is nothing I could say that would dissuade you of this, because I believe it too. In the future, we're all going to be stuck wrestling with AI chatbots that are nothing more than a stalling tactic; you'll argue with it for an age trying to get a refund or whatever and it'll just spin away without any capability to do anything except exhaust you, and on the off chance you do have it agree to refund you the company will just say "Oh, that was a bug in the bot, no refunds sorry!" and the whole process starts again. A lot of people think about AI and wonder how good it'll get, but that is the wrong question. The more pertinent one is how bad companies are willing to accept. AI isn't going to be used for anything important, but it 100% is going to be weaponized against people and processes that the users of AI think are unimportant: companies who don't respect artists and other creative roles will have Midjourney churn out slop, blogs that don't respect their visitors will belch out endless content farms to trick said visitors into viewing ads, companies that don't respect their customers will bombard review sites with hundreds of positive reviews, all in different styles so that review site moderators have no way of telling what's real or not. AI is going to flood the internet with such levels of unusable bullshit that it'll be unrecognizable in a few years.

Reply

[-]

MrChocodemon@reddit

> In the future, we're all going to be stuck wrestling with AI chatbots

Already had the pleasure when contacting Fitbit. The "AI" tried to gaslight me into thinking that restarting my smartwatch would achieve my desired goal... I was just searching for a specific setting and couldn't convince the bot that 1) I had already restarted the watch ("just try it again please") and 2) restarting the watch should never change my settings; that would be horrible design. It took nearly an hour for me to get the bot to refer me to a real human, who then helped fix my problem in less than 5 minutes...

Reply

[-]

Coffee_Crisis@reddit

just lie and say you did it

Reply

[-]

Stimunaut@reddit

Or, hear me out, buy one less device that you intend on strapping to yourself at all times (which ultimately just serves to annoy you).

Reply

[-]

Nesman64@reddit

"I understand. As the next step, please restart the device."

Reply

[-]

MrChocodemon@reddit

That just caused a loop, where it insisted on me trying again.

Reply

[-]

Coffee_Crisis@reddit

Keep going!

Reply

[-]

MrChocodemon@reddit

That just caused a loop, where it insisted on me trying again.

Reply

[-]

DirectorBusiness5512@reddit

That will be the moment when using paper mail to get a refund is faster lmao. If there is no fast way to get a refund in the future, maybe companies will get hit with so many chargebacks from credit cards that they are forced to make returns easier.

Reply

[-]

Agitates@reddit

It's a different kind of pollution. A tragedy of the commons.

Reply

[-]

crabmusket@reddit

I agree with your sentiment, but it's not a tragedy of the commons (a dubious concept in any case). Maybe a market failure.

Reply

[-]

GenTelGuy@reddit

Tragedy of the commons is dubious in general? Isn't climate change via greenhouse gas emissions a textbook example?

Reply

[-]

crabmusket@reddit

Wiki has a good summary of the concept including criticism: https://en.wikipedia.org/wiki/Tragedy_of_the_commons#Criticism Basically, wherever the phrase is used, it's typically not in reference to a commons. The entire atmosphere of planet earth, in the climate change example, is nothing like a commons. The "tragedy" referred to is that no one user of the "commons" resource has the incentive to moderate their use of it. This is simply not the case when the situation is as asymmetric as e.g. the interests of the owners of fossil fuel companies versus the interests of Pacific island nations.

Reply

[-]

IrritableGourmet@reddit

> Basically, wherever the phrase is used, it's typically not in reference to a commons. The entire atmosphere of planet earth, in the climate change example, is nothing like a commons.

No offense, but that sounds like etymological pedantry. It's like saying you can't use the phrase "it was their Waterloo" if they weren't commanding a major land battle with horse cavalry.

> The "tragedy" referred to is that no one user of the "commons" resource has the incentive to moderate their use of it.

That's what's going on with the climate change example. No one company/country is incentivized to moderate their usage because other companies/countries don't/won't, and it has an economic cost. It's the asshole version of a Nash equilibrium. You actually see this a lot in discussions on environmental regulations: "Yeah, electric cars are great, but China's still going to be polluting a lot, so it doesn't matter."

Reply

[-]

crabmusket@reddit

> No offense, but that sounds like etymological pedantry.

None taken, that's exactly what it is! I don't agree with your Waterloo characterisation though. Using the phrase "tragedy of the commons" reinforces the idea that this kind of thing is natural and inevitable. It's not, and we're able to choose to improve things.

> You actually see this a lot in discussions on environmental regulations: "Yeah, electric cars are great, but China's still going to be polluting a lot, so it doesn't matter."

You do see this a lot, but it's just scapegoat rhetoric.

Reply

[-]

IrritableGourmet@reddit

> Using the phrase "tragedy of the commons" reinforces the idea that this kind of thing is natural and inevitable. It's not, and we're able to choose to improve things.

Yes, but the only stable solution is if *everyone* (or most everyone) chooses to change, hence the reference to a Nash equilibrium (If each player has chosen a strategy - an action plan based on what has happened so far in the game - and no one can increase one's own expected payoff by changing one's strategy while the other players keep theirs unchanged, then the current set of strategy choices constitutes a Nash equilibrium). For example, if only one non-monopoly company decides to go green, then that strategy will likely cost them significantly more in expenses than their competitors, giving their competitors an economic advantage and making it more likely that they will gain more of the market through their non-green approach, negating that one company's efforts. The only way for it to work is if either (a) the government steps in and enforces regulations, (b) they find a way to make more money from an environmental approach than a polluting one, or (c) they all agree to participate.
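
To make that definition concrete, here is a minimal sketch in Python. The payoff numbers are invented purely for illustration (they are not from any real data), and names like `PAYOFFS` and `with_pollution_fine` are hypothetical. It checks every strategy pair for two competing companies and reports which pairs are Nash equilibria, first as-is and then with a regulator's fine on polluting, as in option (a).

```python
from itertools import product

# Hypothetical payoffs (profit units) for two competing companies.
# Each picks "green" or "pollute"; the numbers are illustrative only.
STRATEGIES = ("green", "pollute")
PAYOFFS = {
    ("green", "green"): (3, 3),
    ("green", "pollute"): (0, 4),
    ("pollute", "green"): (4, 0),
    ("pollute", "pollute"): (1, 1),
}


def is_nash_equilibrium(profile, payoffs):
    """True if no single player can do better by changing strategy alone."""
    for player in range(len(profile)):
        current = payoffs[profile][player]
        for alternative in STRATEGIES:
            deviated = list(profile)
            deviated[player] = alternative
            if payoffs[tuple(deviated)][player] > current:
                return False
    return True


def nash_equilibria(payoffs=PAYOFFS):
    """List every strategy pair that is a Nash equilibrium."""
    return [p for p in product(STRATEGIES, repeat=2) if is_nash_equilibrium(p, payoffs)]


def with_pollution_fine(fine):
    """Option (a): a regulator subtracts `fine` from each polluter's payoff."""
    return {
        profile: tuple(
            pay - fine if choice == "pollute" else pay
            for choice, pay in zip(profile, pays)
        )
        for profile, pays in PAYOFFS.items()
    }


if __name__ == "__main__":
    print(nash_equilibria())
    print(nash_equilibria(with_pollution_fine(2)))
```

With these made-up numbers, the unregulated game settles on both companies polluting, and a fine of 2 moves the only equilibrium to both going green. Real situations are far messier than a 2x2 table, so treat this as an illustration of the definition and not as a model of climate policy.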

Reply

[-]

crabmusket@reddit

I think that the concept of a Nash equilibrium does apply more aptly to climate change than does tragedy of the commons. However, it's still an oversimplification of an incredibly complex ecosystem (which in the case of climate change comprises nearly all of human activity)... and if the oversimplification serves the purpose of making it seem like change is impossible or extremely difficult, then I'd question the usefulness of using it. If you're a person trying to enact change, you might want to analyse your immediate environment - and if it looks like a Nash equilibrium, what does that tell you about the levers you need to pull to effect change? But maybe the situation is more complicated than that, or maybe your local environment does not look like a Nash equilibrium, or it does but it's not as rigid as the theoretically pure version of the problem. Homo economicus doesn't really exist, and there's always leeway between "less economically competitive" and "not economically competitive".

Reply

[-]

Agitates@reddit

I'm not going to stop using that phrase until a better one that most people know of comes along.

Reply

[-]

crabmusket@reddit

What we have here is a collective action problem. If nobody wants to use a better phrase until the better phrase is popular, it won't become popular! And I'd argue that "collective action problem" is often more apt than "tragedy of the commons" depending on the actual event being described.

Reply

[-]

Coffee_Crisis@reddit

Everyone publishing AI slop into their own corner of the Web because they can glean a tiny bit of ad revenue from doing so, and therefore making the Web suck for everyone, seems close enough to the actual tragedy of the commons scenario that it's still a worthwhile use. If the 'commons' isn't actually a real enough thing for the idiom to be useful, then why insist on being so strict about it? This seems pedantic.

Reply

[-]

ALittleFurtherOn@reddit

To put it simply, it is the end result of the ad-funded model. Collectively, we are too cheap to pay for anything … this is what you get “for free.”

Reply

[-]

SanityInAnarchy@reddit

This is already what it feels like to call Comcast. Their bot is only doing very simple keyword matching, but its voice recognition sucks so much that I have shouted "No! No! No!" at it and it has "heard" me say "yes" instead. Amazon is the exact opposite: No matter what your complaint is, about the only thing either the bots or the humans are willing to do is issue refunds.

Reply

[-]

Captain_Cowboy@reddit

That's because Amazon is actually just providing cover for a bunch of bait-and-switch scams. Providing a refund isn't much help getting you the product at the price they advertised. "Yes, we run the platform, advertise the product, process the payment, provide the support, ship it, and are even the courier, but they're a 3rd party, so we're not responsible for their inventory. And we don't price match."

Reply

[-]

SanityInAnarchy@reddit

I mean, they are also delivering a *lot* of actual products. It's more that delivering those refunds is the quickest way they can claw back *some* goodwill, and it's infinitely easier than any of the other things they could do. For example, I don't think they're even *pretending* to ask you to ship the thing back anymore.

Reply

[-]

turtle4499@reddit

Amazon tried to get me to ship back an illegal medical device they sold me…. Having to explain to someone that I will not be mailing the device labeled prescription only that was also sent in the wrong size and model type was a slightly insane convo. Me just being like u understand this is evidence and illegal for me to mail correct?

Reply

[-]

McMammoth@reddit

What was it?

Reply

[-]

Coffee_Crisis@reddit

for now it usually works to just say 'operator' or 'give me an agent' or 'supervisor' and refuse to answer the actual questions, but that will end when enough people start doing it

Reply

[-]

c00a5b70@reddit

Sadly I have but one vote to give to this comment. I’d say it was prescient, but it’s already happening on low quality sites where long, pointless, fluff pieces are posted with no dates and bylines to provide space for ads.

Reply

[-]

RedPandaDan@reddit

I genuinely believe that the future of the internet is going to be small enclaves of a few hundred people on invite-only message boards, anything else is going to have you stuck dealing with tidal waves of bullshit.

Reply

[-]

stahorn@reddit

The root cause of problems like this is of course a legal one. If it's legal and beneficial for a company such as an insurance one to drag out these types of communications to pay out less to their customers, they will always do so. The solution is then of course also legal: Make it a requirement that insurance companies provide a correct and quick way for their customers to report and get their claims.

Reply

[-]

MohKohn@reddit

As someone who interacts with phone trees way too often, this is the use-case that has me the most worried. We definitely need legislation that charges companies for wasting customer's time.

Reply

[-]

slvrsmth@reddit

This is the future I'm afraid of - LLMs generating piles of text from a few sentences (or thin air, as in this case) on one end, forcing the use of an LLM on the receiving end to summarise the communication. Work for the sake of performing work. Although for me all these low-effort AI generated text examples (read: ones where the author does not spend time tinkering with prompts or manually editing) stand out like a sore thumb - mainly the air of politeness. I've yet to meet a real person that keeps insisting on all the "ceremonies" in the third or even second reply within a conversation. But every LLM generated text seems to include them by default. I fear for the day when the models grow enough tokens to comfortably "remember" whole conversations.

Reply

[-]

Cautious-Nothing-471@reddit

> Work for the sake of performing work.

sounds like bitcoin

Reply

[-]

pure_x01@reddit

The problem is that as soon as these idiots realise that they can’t just send llm output as it is they will learn that they need to just instruct the llm to write in a different text style. It will be impossible to detect all llm crap. The only thing that can or perhaps should be done is to set requirements on the reports. They have to be short and clear and make it easy to understand the issue. Then at least it will be quicker to go through them.

Reply

[-]

jdehesa@reddit

Exactly. A lot of people who look very self-satisfied saying they can call out LLM stuff from miles away don't seem to realise we are at the earliest stage of this technology, and it is having a huge impact in many domains already. Even if you can always tell right now (which is probably not even true), you won't soon enough. A great deal of business processes rely on the assumption that moderately coherent text is highly unlikely to be produced by a machine, and they will all be eventually affected by this.

Reply

[-]

blind3rdeye@reddit

Not only that, but also the massive effect of confirmation bias. Imagine, you see some text that you think is LLM generated. You investigate, and find that you are right. So this means you are able to spot LLM content. But then later you see some content that you don't think is LLM generated, so you don't investigate, and you think nothing of it. ... People only notice the times that they correctly identify the LLM content. They do not (and cannot) notice the times when they failed to identify it. So even though it might feel like you are able to reliably spot LLM content, the truth is that you can *sometimes* spot LLM content.
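
The gap between "every time I checked, I was right" and "I catch most of it" can be shown with a little arithmetic. The sketch below is only an illustration; the counts are made up, and `detection_stats` is a hypothetical helper, not a real tool.

```python
def detection_stats(llm_texts_seen, flagged_and_confirmed, flagged_but_human=0):
    """Compare how good detection feels with how good it actually is.

    llm_texts_seen:        LLM-written texts the reader actually came across
    flagged_and_confirmed: texts the reader suspected, checked, and got right
    flagged_but_human:     texts the reader suspected that turned out human
    """
    investigated = flagged_and_confirmed + flagged_but_human
    # What the reader experiences: hit rate among the cases they checked.
    perceived = flagged_and_confirmed / investigated if investigated else 0.0
    # What they cannot observe: the share of all LLM texts they caught.
    recall = flagged_and_confirmed / llm_texts_seen if llm_texts_seen else 0.0
    return {
        "perceived_hit_rate": perceived,
        "actual_recall": recall,
        "missed": llm_texts_seen - flagged_and_confirmed,
    }


if __name__ == "__main__":
    # Hypothetical numbers: 100 LLM texts seen, 30 suspected and confirmed.
    print(detection_stats(llm_texts_seen=100, flagged_and_confirmed=30))
```

With those hypothetical counts, the reader was right every single time they checked, yet 70 of the 100 LLM texts went by unnoticed. Only the first number is visible to the reader.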

Reply

[-]

renatoathaydes@reddit

That's true, and it's true of many other things, like propaganda (especially one of its branches, called Marketing). Almost everyone seems to believe they can easily spot propaganda, not realizing that they have been influenced by propaganda their whole life, blissfully unaware.

Reply

[-]

jdehesa@reddit

That's a very good observation.

Reply

[-]

lenzo1337@reddit

earliest? This stuff's been around forever, only difference is that we have computational power cheap enough for it to be semi viable. That and petabytes of data leeched from clueless end-users. Besides that there hasn't really been anything new (as in real discoveries) in AI in forever. Most of the discoveries have just been people realizing that some mathematician had a way to do something that just hadn't been applied in CS yet. Honestly hardware is the only thing that's really advanced much at all. We still use the same style of work to write most software.

Reply

[-]

goranlepuz@reddit

Yes, the underlying discoveries and technical or scientific advances are often made decades before their industrialization, news at 11. But, industrialization is where the bulk of the value is created. Calm down with this, will you?

Reply

[-]

jdehesa@reddit

No, widely available and affordable technology to automatically generate text that most people cannot differentiate from text written by a human, about virtually any topic (whether correct or not), has not "been around forever". And yes, hardware is a big factor (though transformers are a relatively recent development, but it is an idea made practical by modern hardware more than a groundbreaking breakthrough on its own). But that doesn't invalidate the point that this is a very new and recent technology. And, unlike other technology, it has shown up very suddenly and has taken most people by surprise and unprepared for it. Dismissive comments like "this has been around forever", "it is just a glorified text predictor", etc. are soon proved wrong by reports like the linked post. This stuff is presenting challenges, threats, opportunities, problems that did not exist just a year ago. Sure, the capacities of the technology may have been overblown by many (no, this is not "the singularity"), but its impact on society really goes far.

Reply

[-]

lenzo1337@reddit

Neural networks aren't new by any means. That's just a fact. It's not a "new" technology. It isn't the "earliest" stages of this (neural networks). They have been around since the 1950's and the logic behind that was from the 1800's. It's not going to be able to get us AGI and most likely the best it will do is flood all institutions with its misinformation and hallucinations to the point that any useful work it does will probably end up not being a net gain imho. It's a joke to pretend that no one noticed the advances in hardware and their applications in machine learning and AI before LLMs. You could see the seeds of this in gpu/fpga usage in CV applications and even later in IBM's watson etc. Sure "affordable", the cost is just hidden; your time, thoughts, information and massive amounts of hardware on the back-end.

Reply

[-]

wankthisway@reddit

Good god man, nobody is claiming the underlying principles are anything new. The recent proliferation of easily accessible text generators like this, however, ARE new technology. It's pretty obvious that's what the original commenter meant when they said "technology," and only the most pedantic has-to-be-the-smartest redditor would intentionally try to misinterpret it.

Reply

[-]

my_aggr@reddit

> Neural networks aren't new by any means. That's just a fact. It's not a "new" technology. Neither are wheels yet trains were something of a big deal when they were invented.

Reply

[-]

pure_x01@reddit

Yeah the only reason you can tell right now is that some people don’t know that you can just add an extra sentence at the end, for example: “this should be written in a clear, professional concise way with minimal overhead”. Works today and very well with GPT-4. More advanced users could train an llm on all previous reports and then just match that style.

Reply

[-]

PaulSandwich@reddit

Even this misses one of the author's main points. Sometimes people use LLM appropriately for translation or communication clarity, *and that's a good thing*. If someone finds a catastrophic zero day bug, you wouldn't want to trash their report simply because they weren't a native speaker of your language and used AI to help them save your ass.

Reply

[-]

Bwob@reddit

> The only thing that can or perhaps should be done is to set requirements on the reports. They have to be short and clear and make it easy to understand the issue. Then at least it will be quicker to go through them.

Can the submission process be structured in a way that makes it easy to automate testing? Like "Submit a complete C++ program that demonstrates this problem?" and then feed it directly to a compiler that runs it inside of a VM or something?
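
As a rough idea of what the "run it automatically" step might look like, here is a minimal Python sketch. Everything in it is an assumption for illustration: `run_reproducer` is a hypothetical helper, the compile step is left out, and it provides no isolation at all, so it is assumed to run inside a throwaway VM or container.

```python
import subprocess
import sys


def run_reproducer(command, timeout_s=10):
    """Run an already-built reproducer and summarize how it ended.

    NOTE: no sandboxing happens here. This is assumed to be called from
    inside a disposable VM or container, never on a developer machine.
    """
    try:
        result = subprocess.run(
            command, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising this.
        return {"status": "timeout", "returncode": None}
    # On POSIX, a negative return code means the process was killed by a
    # signal (for example -11 for SIGSEGV), which is what a crash looks like.
    status = "crashed" if result.returncode < 0 else "exited"
    return {"status": status, "returncode": result.returncode}


if __name__ == "__main__":
    # Stand-in for a submitted program: a child Python process exiting with 3.
    print(run_reproducer([sys.executable, "-c", "raise SystemExit(3)"]))
```

A triage script could then queue for human review only the reports whose reproducer actually crashes or misbehaves. That still would not prove the report is a real vulnerability; it would only filter out the ones that demonstrate nothing.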

Reply

[-]

TinyBreadBigMouth@reddit

In a dizzying twist of irony, hackers exploit a security bug to break out of the VM and steal undisclosed security bugs.

Reply

[-]

pure_x01@reddit

That would be nice. I’m thinking of how many science reports use Python in Jupyter notebooks as part of the report. Perhaps something like that could be done with C/C++ and Docker containers. They could be isolated and executed on an isolated VM for dual-layer security.

Reply

[-]

nvn911@reddit

Hey someone's gotta keep those data centres pegged at 100% CPU

Reply

[-]

Coffee_Crisis@reddit

so much pegging to do, so little time

Reply

[-]

nvn911@reddit

A peg a day...

Reply

[-]

SanityInAnarchy@reddit

> I've yet to meet a real person that keeps insisting on all the "ceremonies" in the third or even second reply within a conversation.

It stands out even in the first one -- they tend to be absurdly, profoundly, overwhelmingly verbose in a way that technically isn't *wrong,* but is far more fluff than a human would bother with.

Reply

[-]

Coffee_Crisis@reddit

Hi /u/SanityInAnarchy, thanks very much for submitting this response. I hope that our discussion will help us arrive at a mutually beneficial and edifying outcome. One thing I would like you to take into account that maybe you haven't considered, and I hope you will agree to this, is that many people who write in a business context think that professional writing means using this kind of extremely overwrought and verbose style, often hoping it will obscure their complete poverty of thought and lack of insight. I hope you'll allow me a little more time to add that business and marketing fluff may very well be overrepresented in the training corpus, especially if things like emails from outlook or office365 were somehow included in large quantity in the training data. Thanks very much for your attention to this matter. Please remember that I'm just another user on Reddit and these words are only my personal opinions and ideas based on approximately 5 seconds of actual thought, and I haven't had my coffee yet this morning. I hope you found this helpful and if I said anything that made you upset please know that was absolutely not my intention.

Reply

[-]

Coffee_Crisis@reddit

The really great thing is that you will see more and more models trained on bullshit AI text because there is no pure human corpus available anymore, so it will just reinforce and reproduce all the AI barnacle bullshit words

Reply

[-]

goranlepuz@reddit

Well, in this case, it's work for the sake of collecting bounty... 😭😭😭

Reply

[-]

TinyBreadBigMouth@reddit

> I've yet to meet a real person that keeps insisting on all the "ceremonies" in the third or even second reply within a conversation.

These people do exist and are known as "Microsoft Community Moderators". I'm semi-convinced that LLMs get it from there.

Reply

[-]

python-requests@reddit

Hi /u/TinyBreadBigMouth, The issue with the LLM responses can be altered in the Settings -> BS Level dialog or with Ctrl + Shift + F + U. Kindly alter the needful setting. I hope this helped!

Reply

[-]

Cruxius@reddit

Might be where the LLMs are getting their incorrect answers from too.

Reply

[-]

yawara25@reddit

Have you tried running `sfc /scannow`? This thread has been closed.

Reply

[-]

sparant76@reddit

Lol, like, I don’t think you can tell if text is from a computer or a human. Like, these big language models are so good at writing stuff that it’s hard to tell if it’s from a person or not. But, like, some people say that there are some differences between the two. Like, humans use more emotions and shorter sentences, while computers use more numbers and symbols. But, like, I don’t think it’s that easy to tell. You know what I mean? 😜

Reply

[-]

coffee_kazoo@reddit

Search engines are now deprioritizing human-generated "how-to" content in favor of their LLMs spitting out answers. This resulted in me (and likely others) no longer writing this content, because I'm not terribly interested in its sole purpose being to train search engine models. Assuming there's less and less human-generated content out there, will the LLMs just start feeding off other LLM content? Will small hallucinations in LLM content get amplified by subsequent LLM content?

Reply

[-]

remyz3r0@reddit

Yes I think eventually, this is what will happen. At the moment, there exists a safeguard that allows LLMs to filter out content generated by other LLMs from their training set but eventually they'll get good enough that even the filters no longer work. They'll end up cannibalizing each other's auto-generated content and we'll end up with a massive crock of crap for the web.

Reply

[-]

DirectorBusiness5512@reddit

Internet, meet House of Habsburg!

Reply

[-]

drekmonger@reddit

There are humans in the training loop for the bigger models. Not everything gets gobbled up and tossed into the training maw automatically. Also synthetic data is actually useful for training.

Reply

[-]

skippy@reddit

The use case for AI is spam

Reply

[-]

joelypolly@reddit

If you supply the latest GPT model with the article and ask about the joke in the title, here is what you get.

> "The I in LLM stands for intelligence" in this context is ironic and carries a critical undertone. The writer is expressing frustration with the misuse of Large Language Models (LLMs) for generating security reports, particularly in relation to the curl project's bug bounty program. The joke suggests that despite the term 'intelligence' being commonly associated with AI, the outcomes produced by these LLMs in this scenario have not demonstrated practical intelligence - that is, they have not resulted in useful or valid security vulnerability reports.
>
> The author's experience is that several of the security reports received appear to be unfounded or nonsensical, often because they are generated by AI without sufficient human oversight or understanding. This type of 'AI-generated garbage' presents a problem because it takes up valuable time and resources that could be devoted to addressing real issues. The joke is a commentary on the discrepancy between what one might expect from something with 'intelligence' in its name (i.e., meaningful contributions) and what is actually being received (i.e., low-quality, auto-generated reports).
>
> The intent behind the statement is to highlight the inefficiencies introduced by indiscriminate use of AI tools. While AI has the potential to assist in identifying legitimate security concerns, the author points out that, in his immediate experience, LLMs have primarily generated false alarms, causing more harm than good. This reflects a broader sentiment that AI, particularly when used without human expertise, can produce outputs that are misinformed or misleading, leading to wasted effort and potential distraction from important tasks.
>
> By saying "The I in LLM stands for intelligence," the author sarcastically implies that despite the 'intelligent' label often given to AI, the resultant actions (i.e., submitting baseless security reports) can lack the qualities of true intelligence, such as discernment, understanding, and relevance. The joke serves as a critique of overreliance on AI without human insight and the current state of AI-generated contributions to the field of security reporting.

Reply

[-]

swansongofdesire@reddit

Ironic that at no point did chatGPT pick up on the fact that the natural human response is to say “but there is no I in LLM. Oh that’s the joke”

Reply

[-]

grady_vuckovic@reddit

An excellent example of the problem. Because a human would have said, "The joke is, there's no I in LLM."

Reply

[-]

m0bius_stripper@reddit

This sounds like an English student writing 3 pages of decent analysis but completely missing the simpler point (i.e. there literally is no I in the acronym LLM).

Reply

[-]

SmokeyDBear@reddit

I feel like LLMs are the embodiment of Stephen Colbert’s “truthiness” concept from the Colbert Report days. It’s saying a lot of not-wrong-sounding things but also pretty clearly not getting why the joke is funny or even a joke.

Reply

[-]

c00a5b70@reddit

If your programming interview responses can be automated, probably they should be. Otherwise it’s a waste of everyone’s time.

Reply

[-]

Glitch29@reddit

So many of these problems ultimately come back to the importance of trackable reputation. There's a finite amount of bad stuff that can be submitted by someone with something to lose until they've lost everything and no longer fit that description. You do run into a bootstrapping problem though. How does someone go from zero reputation to non-zero reputation in a world where the reputationless population is so full of dreck that nobody even wants to review it?

Reply

[-]

monnef@reddit

> I suspect we might learn how to trigger on generated-by-AI signals better

I have serious doubts about this. I think two weeks ago I tried what are presumably the best tools for detecting AI-generated text (recommended by users and a few articles on big sites), and with a simple addition of "mimic writing style of ..." to a prompt for GPT4, every tool tested on the AI output said the text came from a human, ranging 85-100% human...

Reply

[-]

Innominate8@reddit

The problem is LLMs aren't fundamentally about getting the right answer; they're about convincing the reader that it's correct. Making it correct is an exercise for the user. The novices trying to use LLMs to replace experts will eventually find they lack the skills to determine where the LLM is wrong. I don't see them as a serious threat to experts in any field anytime soon, but dear god they are proving excellent at generating noise. I think in the near future, this is just going to make true experts that much more valuable. The people who need to worry are the copywriters and similar non-expert roles which involve low-creativity writing as their job is essentially the same thing.

Reply

[-]

SanityInAnarchy@reddit

That noise is still a problem, though. You know why we still do whiteboard/LC/etc algo interviews? It's because some people are good enough at bullshitting to [*sound* super-impressive right up until you ask them to actually produce some code](https://thedailywtf.com/articles/Classic-WTF-The-Abstract-Candidate). This is why, even if you think LC is dumb, I *beg* you to always *at least* force people to do [something like FizzBuzz](https://imranontech.com/2007/01/24/using-fizzbuzz-to-find-developers-who-grok-coding/). Well, I went and checked, and *of course* ChatGPT *destroys* FizzBuzz. Not only can it instantly produce a working example in any language I tried, it was able to modify it easily -- not just minor things like "What if you had to start at 50 instead?", but much larger ones like "What if it's other substitutions and not just fizzbuzz?" or "How do you make this testable?" I'm not too worried about this being a problem at established tech companies -- cheating your way through a phone screen is just more noise, it's not gonna get you hired. I'm more worried about what happens when a non-expert has to evaluate an expert.
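
For anyone who hasn't seen it, this is roughly the kind of answer being described: a small Python sketch of FizzBuzz that already covers the follow-up questions (a different starting point, other substitutions, testability). The function name and parameters are just one possible shape, not output copied from ChatGPT.

```python
def fizzbuzz(start=1, stop=100, substitutions=((3, "Fizz"), (5, "Buzz"))):
    """Return the FizzBuzz lines for start..stop (inclusive) as strings.

    Returning a list instead of printing keeps the function easy to test,
    and `substitutions` allows rules other than the classic 3 and 5.
    """
    lines = []
    for n in range(start, stop + 1):
        # Concatenate the word for every divisor that divides n evenly.
        words = "".join(word for divisor, word in substitutions if n % divisor == 0)
        # Fall back to the number itself when no rule matched.
        lines.append(words or str(n))
    return lines
```

Calling `fizzbuzz(50, 60)` handles the "start at 50" variation, and passing something like `substitutions=((7, "Bazz"),)` handles the "other substitutions" one.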

Reply

[-]

Coffee_Crisis@reddit

I have never found that people actually can bullshit like that if you just drill down into the stuff they're saying and ask for more specifics. I really don't get this. If someone says they have been building serverless data pipelines and you just start asking them questions about it they will hit a point where they can't give you specifics.

Reply

[-]

python-requests@reddit

I think longterm the best kinda interview is going to be something with like, multiple independent pieces of technical work (not just code, but also configuration & some off-the-wall generic computer-fu) written from splotchy reqs & intended to work in concert without that being explicit in the problem description. Like the old 'notpr0n' style internet puzzles basically. But with maybe two small programs from two separate specs that are obviously meant to go together, & then using them together in some way to... idk, solve a third technical problem of some sort. Something that hits on coding but also on the critical-thinking human element of non-obvious creative problem solving.

Reply

[-]

SanityInAnarchy@reddit

Maybe, but coding interviews work fine now, today, if you're willing to put in the effort. The complaint everyone always has is that they'll filter out plenty of good people, and that they aren't necessarily representative of how well you'll do once hired, but they're hard to just entirely cheat. Pre-pandemic, Google almost never did remote interviews. You got one "phone screen" that would be a simple Fizzbuzz-like problem (maybe a *bit* tougher) where you'd be asked to describe the solution over the phone... and then they'd fly you out for a full day of whiteboard interviews. Even cheating at that would require some coding skill -- like, even if you had another human telling you exactly what to say over an earpiece or something, how are you going to work out what to draw, let alone what *code* to write? Even remotely, when these are done in a shared editor, you have to be able to talk through what you're doing and why in real time. At least in the short term, it might be a minute before there aren't obvious tells when someone is alt-tabbing to ChatGPT to ask for help.

Reply

[-]

ronoudgenoeg@reddit

> The novices trying to use LLMs to replace experts will eventually find they lack the skills to determine where the LLM is wrong. I don't see them as a serious threat to experts in any field anytime soon, Fully agree with this. - They can effectively replace/significantly reduce the need for juniors, as managing a junior dev to deal with some work often takes more time than it takes to get an LLM to do it. For the experts/senior people, they can use LLMs to replace a lot of the grunt work, because they know what the bigger picture needs to look like. It also helps a lot if you know what you want to do, you just don't know the exact implementation in that specific language. - As an example, I've never worked with pyspark (or python for that matter, really), but I do know what I want to do with it. I have some csv file in blob storage, I want to load it, do some very specific transformations and then save the new file in a different blob. I know every step, I know what things need to look like, but getting a junior dev to do this will probably take me 30 mins or an hour of explanation, and then they'll come back a day later and it may or may not work. With an LLM it takes me 15 minutes to implement and just move on.

Reply

[-]

goranlepuz@reddit

>The novices trying to use LLMs to replace experts will eventually find they lack the skills to determine where the LLM is wrong. Ehhh... In the second case of the TFA, it rather looks like they are not concerned whether they're right or wrong, they're merely trying to force the TFA author to accept the bullshit. I mean, it rather looks like the AI conflated "strcpy bad" with "this code with strcpy has a bug" - and the submitter is turning round in circles peddling the same mistake - until refused by the TFA. It is quite awful.

Reply

[-]

python-requests@reddit

At least they'll be perfect for pop science articles

Reply

[-]

crabmusket@reddit

We're going to see a lot of people discovering whether their task requires _truth_ or _truthiness_. And getting it wrong.

Reply

[-]

IAmRoot@reddit

ML in general is way overhyped by investors, CEOs, and others that don't really understand it well enough. The hardest part about AI has always been teaching meaning. Things have advanced to the point where context can be taken into account enough to produce relatively convincing results on a syntactic level but it's obvious that understanding is far from being there. It's the same with AI models creating images where people have the wrong number of fingers and such. The mimicking is getting good but without any real understanding when you get down to it. As fancy and impressive as things might look superficially in a tech demo pitched to the media and investors, it's all useless if a human has to go through and verify all the information anyway. It can even make things worse by being so superficially convincing.

Reply

[-]

cecilkorik@reddit

Yeah they've basically just buried the credibility problem down another layer of indirection and made it even harder to figure out what's credible and what's not. Like before you could search for a solution to a problem on the Internet and you had to judge whether the person writing the answer knew what they were talking about or not, and most of the time it was pretty easy to figure out but obviously we still had problems with bad advice and misinformation. Now we have to figure out whether it's an AI hallucination, and it doesn't matter whether it's because the AI is stupid or because the AI was training on a bunch of stupid people saying the same stupid thing on the internet, all that matters is that the AI makes it look the same, it's written the same way, and it looks equally as credible as its valid answers. It's a fascinating tool but it's going to be a long time before it can be trusted to replace actual intelligence. It can already replace actual intelligence -- it just can't be trusted.

Reply

[-]

_insomagent@reddit

Internet pollution.

Reply

[-]

sigbhu@reddit

humans are famously bad at dealing with pollution

Reply

[-]

TheCritFisher@reddit

Damn, that second report is awful. Like you wanna be nice, but shit. I feel for these guys. I'm so glad I'm not an OSS maintainer...oh wait, I am. NOOOOOOOOOO!

Reply

[-]

DreamAeon@reddit

You can tell the reporter is not even trying to understand the replies. He’s just chucking the maintainer’s reply to some LLM model and copy pasting the result back as an answer.

Reply

[-]

python-requests@reddit

I wonder if it's a language barrier thing or deliberate laziness (or both?). Also makes me think, I read a comment on (probably) cscareerquestions that suggested that the giant flood of unqualified applications to every job listing might not just be from layoffs & a glut of bootcamp candidates & money chasers -- but rather that it could be a deliberate DoS of sorts against the American tech hiring process by foreign adversaries. The same thing could be going on here -- like maybe Russian/Chinese/Iranian/North Korean teams spamming out zero-effort bug reports en masse using an LLM & some code snippets from the project. Maybe even with a prompt like 'generate an example of a vulnerability report that could be based on code similar to the following'. Then maintainers' time is consumed with bullshit while the foreign cyberwarfare teams focus on finding actual vulnerabilities.

Reply

[-]

goranlepuz@reddit

>I wonder if it's a language barrier thing or deliberate laziness (or both?). Probably both, but the core problem seems to be the ease with which the report is made to look credible, compared to the possible bounty award. (Same reason we have SPAM, really...)

Reply

[-]

narnach@reddit

Honestly it has the same business model as spam: sending it is effectively free, and if conversion is nonzero then there is a financial upside. It won’t stop until the business model is killed. If the LLM hallucinates correctly even 1% of the time, I imagine you can make a decent income with bounties from a low cost of living country. If this becomes widespread, I wonder if bug bounty programs may ask for a small amount of money to be deposited by the “bug hunter” that is forfeit if a bounty claim is deemed to be bogus. Depending on the conversion rate of LLM hallucinations, even $1 may be enough to kill the business model of spamming bug bounties.
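The deposit idea comes down to simple expected-value arithmetic. Here is a rough Python sketch, assuming a report either pays the full bounty (deposit returned) or forfeits the deposit. The function names and every number used with them are hypothetical, not data from any real bounty program.

```python
def expected_profit_per_report(hit_rate, bounty, deposit=0.0, cost_per_report=0.0):
    """Average profit of one spammed report.

    With probability `hit_rate` the report pays `bounty` and the deposit is
    returned; otherwise the deposit is forfeited. `cost_per_report` covers
    whatever it costs to generate and submit the report.
    """
    return hit_rate * bounty - (1.0 - hit_rate) * deposit - cost_per_report


def break_even_deposit(hit_rate, bounty, cost_per_report=0.0):
    """Deposit at which the average profit per report drops to zero."""
    if not 0.0 <= hit_rate < 1.0:
        raise ValueError("hit_rate must be in [0, 1)")
    # Solve hit_rate * bounty - (1 - hit_rate) * d - cost = 0 for d.
    deposit = (hit_rate * bounty - cost_per_report) / (1.0 - hit_rate)
    return max(0.0, deposit)
```

With a made-up $500 bounty, a 1% hit rate needs a deposit of about $5.05 to break even, while at a 0.1% hit rate about $0.50 is already enough. So whether $1 kills the model really does depend on the conversion rate.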

Reply

[-]

SharkBaitDLS@reddit

Never attribute to malice that which can be attributed to stupidity. I'm pretty sure this is just people looking to make a quick buck off bug bounties and throwing shit at the wall to see if it will stick.

Reply

[-]

TheCritFisher@reddit

Yup. It's horrible.

Reply

[-]

bitse@reddit

Both reports are such garbage. I agree with the author that zero effort LLM hallucinations like this should be grounds for banning the account. It's the CVE equivalent of spam calling.

Reply

[-]

Charming-Land-3231@reddit

A Better Word Salad^(TM)

Reply

[-]

Pharisaeus@reddit

A trivial solution: "PoC or GTFO". You need to provide a PoC exploit alongside the vulnerability report. As simple as that. This way the person who is triaging the report can look at / run the exploit and observe the results.

Reply

[-]

glaba3141@reddit

Unfortunately it seems like something like attestation is the best way to at least stem the tide of, if not stop, massive AI spam

Reply

[-]

ph0n3Ix@reddit

> Unfortunately it seems like something like device attestation is the best way to at least stem the tide of, if not stop, massive AI spam How - exactly - would that work?

Reply

[-]

eyebrows360@reddit

inb4 "blockchain". Which, spoiler alert, wouldn't help at all. You'd *actually* need signed *everything*, from the CPU (and motherboard) up, completely locked down. You'd also need a central authority being the only people allowed to run such AI software. Spoiler alert: totally unworkable.

Reply

[-]

glaba3141@reddit

why would I be talking about blockchain? that's not relevant at all, but yes you'd need the packets to be signed by some attestation hardware distributed by a central authority. I don't think this is exactly a good solution, but "AI detectors" are never going to win the catch-up game, so something other than that will have to come up

Reply

[-]

eyebrows360@reddit

> If you have alternate ideas Did you see the bit where I wrote "totally unworkable" after the part where I described what would *actually* be needed to directly combat it? Nobody is going to have alternate [good] ideas because there's no such thing.

Reply

[-]

glaba3141@reddit

okay well, that's a fair response. I'm not sure why i am being so heavily downvoted given that there aren't any other workable ideas either

Reply

[-]

eyebrows360@reddit

> I'm not sure why i am being so heavily downvoted Because other people are aware the "idea" (device attestation) is bad and doesn't solve anything. The absence of workable solutions doesn't suddenly make unworkable ones valid. > The jump to "oh he's a blockchain shill" also was pretty unwarranted. It was an educated guess - people proposing bad ideas tend toward proposing other bad ideas too. You shouldn't take it personally. > What's the point of a forum if I can't bring up a topic without being insulted? What's the point of a forum where bad ideas can't be criticised? It cuts both ways, and in any event any "insults" were directed at the idea being proposed, not "you" per se.

Reply

[-]

dweezil22@reddit

Is the idea that having an approved device is "expensive" so it discourages abuse?

Reply

[-]

glaba3141@reddit

yes, and it's also very easy to rate limit a suspected spammer, and they cannot use traditional avenues to evade such rate limits other than by buying another device

Reply

[-]

AyrA_ch@reddit

> Spoiler alert: totally unworkable. Not entirely true. The recent efforts of MS to have all Windows machines equipped with a TPM would allow this because this component is getting increasingly common on new machines. Each TPM contains a key that is completely unique to that machine and is signed by the TPM manufacturer (known as the "Endorsement Key"), as admin you can obtain it in powershell using `Get-TpmEndorsementKeyInfo`. Only a handful of manufacturers are approved to be TCG compliant and you can't just create your own TPM and have it work, only 26 manufacturers are currently authorized. This key can indirectly be used to sign arbitrary data, and to prove that the machine is in a known trusted state. By requesting that the data you send is signed by the TPM, reports from tampered machines can be rejected, and entire machines can be blocked on the receiver side if lots of bad reports are sent from it. An effect of this policy would be that people who use AI to generate automated reports would need to regularly buy a new TPM, or in most cases, a new mainboard because plug-in TPM devices are getting less common.

Reply

[-]

Uristqwerty@reddit

You also need to verify that the keyboard it was typed with came from a trusted manufacturer, that its traces haven't been re-routed to an arduino (so, the keyboard keeps metrics on key-bounces and their statistical variation), and that the timing between presses remain organic. You need to keep this metadata around as text gets copied between all legitimate applications. You need to account for all manner of accessibility software as well, as naive detection would see it as non-organic input events despite indirectly originating from a human.

Reply

[-]

AyrA_ch@reddit

We don't have to do that at all. As long as the submitted data is cryptographically tied to a given machine, it (as well as all past and future data) can be rejected permanently. Since it's not possible to re-key a TPM, the only way around a lockout is to buy new hardware with a new TPM. This quickly becomes a money sink, especially when companies start building and sharing key ids of bad TPMs
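The receiver-side bookkeeping for this could be sketched roughly as below. This is not TPM code: it assumes the signature on each report was already verified against the machine's key somewhere else, and `key_id`, `ReportStore`, and their methods are hypothetical names used only for illustration.

```python
import hashlib
from collections import defaultdict


def key_id(public_key_bytes: bytes) -> str:
    """Stable identifier for a signing key: SHA-256 of its encoded public key."""
    return hashlib.sha256(public_key_bytes).hexdigest()


class ReportStore:
    """Tracks reports per key id and refuses everything from blocked keys.

    Assumption: signature verification happened before submit() is called.
    """

    def __init__(self):
        self._reports = defaultdict(list)  # key id -> reports accepted so far
        self._blocked = set()              # key ids that are locked out

    def submit(self, kid: str, report: str) -> bool:
        """Store the report; return False if its key is blocked."""
        if kid in self._blocked:
            return False
        self._reports[kid].append(report)
        return True

    def block(self, kid: str) -> list:
        """Block a key and hand back its past reports for retroactive rejection."""
        self._blocked.add(kid)
        return self._reports.pop(kid, [])

    def import_blocklist(self, kids) -> None:
        """Merge key ids that another party has shared."""
        self._blocked.update(kids)
```

Blocking a key returns everything it sent earlier, which covers the "past and future data" part, and `import_blocklist` stands in for sharing bad key ids between companies.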

Reply

[-]

Uristqwerty@reddit

Well, until botnets see it as a bonus resource to extract from infected computers. Or perhaps you get sites that offer 1$ in robux just for copy-pasting some text, convincing people too young to know any better to get their devices de-trusted for someone else's benefit. Oh, you wrote that essay on a public library computer? Too bad, 7 months ago some script kiddie plugged in a USB stick, and now it's considered an AI source. As with people running crypto-miners on free CI time, it'll ultimately lead to security and usability clashing, and all sorts of public benefits getting restricted in the fallout.

Reply

[-]

AyrA_ch@reddit

There's nothing that would stop a USB based TPM from working.

Reply

[-]

Uristqwerty@reddit

I don't see what benefit *that* brings. If you plug a USB TPM into an untrusted computer, the key itself can become tainted as readily as the computer's built-in one may have been previously tainted by someone else. And that's on top of its stored signing keys serving as global identifiers to de-anonymize you even if you switch between devices and don't log into any shared accounts between them.

Reply

[-]

AyrA_ch@reddit

TPMs have the ability to require local presence. In other words, like a passkey they can request the user to identify (for example via fingerprint) before data is signed. Or as mentioned, you can use the TPM in your phone to sign data created on another machine, which would transfer the data to the phone, so you can review it before it's signed.

Reply

[-]

Uristqwerty@reddit

The Venn diagram between people who'd be using a public library computer, and people with a smartphone with a remotely-recent TPM doesn't look pretty. On top of that, in order to perform the transfer you're opening attack surface between that phone and a public device, whether by scanning a QR code that may instead load a PDF with an embedded exploit; by logging in to a cloud account allowing a keylogger to grab your credentials (including those that just capture key-clacking passively and can decode it later; no secure boot environment can stop that); by making a bluetooth or USB connection, thus exposing more of your phone's driver stack to the computer, etc.

Reply

[-]

AyrA_ch@reddit

> The Venn diagram between people who'd be using a public library computer, and people with a smartphone with a remotely-recent TPM doesn't look pretty. It does. Smartphones commonly have had security processors within them for much longer than x86 machines. I never had a smartphone that just let me replace the boot files because they're always protected. You had to unlock the bootloader, which will trip a flag in the security processor. > On top of that, in order to perform the transfer you're opening attack surface between that phone and a public device, whether by scanning a QR code that may instead load a PDF with an embedded exploit; by logging in to a cloud account allowing a keylogger to grab your credentials (including those that just capture key-clacking passively and can decode it later; no secure boot environment can stop that); That is not how QR codes work. They're no more a data transmission medium than a paper with an OCR reader is. > by making a bluetooth or USB connection, thus exposing more of your phone's driver stack to the computer, etc. Both of those protocols support an RS-232 compatible serial interface, which reduces it to a metadataless binary transport protocol.

Reply

[-]

Uristqwerty@reddit

> It does. Smartphones commonly have had security processors within them for much longer than x86 machines. And every single human has a smartphone, that wasn't pre-owned and thus potentially has already had its TPM distrusted due to actions by its former owners? > That is not how QR codes work. They're no more a data transmission medium than a paper with an OCR reader is. They don't magically perform any tasks for you. Much as a browser recognizes `mailto:` URLs rather than opening them as web pages, specially-formatted QR codes actually *can* trigger various device-, OS-, or app-specific functionality. They can encode binary data, so the entire range of malformed Unicode and NUL shenanigans are possible, just in case the app reading it missed accounting for even a single edge case, and this is all before considering that if it's a valid URL, the target may send back arbitrary headers and an arbitrary body, and might not even be opened within the carefully-hardened and sandboxed web browser app. The user is *expecting* a file that they want to access, which gives avenues to social-engineer them into bypassing existing protections.
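To illustrate the point about schemes and control characters, here is a small Python sketch of how a cautious scanner app might triage an already-decoded QR payload before acting on it. The policy (`ALLOWED_SCHEMES`) and the function name `classify_qr_payload` are made up for the example, and real scanner apps may behave quite differently.

```python
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"https"}  # hypothetical policy: only open plain HTTPS links


def classify_qr_payload(payload: str) -> str:
    """Return 'reject', 'text', 'open', or 'ask' for a decoded QR payload."""
    # Control characters such as NUL have no place in a link or in plain text.
    if any(ord(ch) < 0x20 and ch not in "\t\n\r" for ch in payload):
        return "reject"
    try:
        scheme = urlsplit(payload).scheme
    except ValueError:
        return "reject"  # malformed URL-like input
    if not scheme:
        return "text"    # no scheme at all: treat as inert text
    if scheme in ALLOWED_SCHEMES:
        return "open"
    return "ask"         # mailto:, tel:, wifi:, app-specific schemes, ...
```

Even the "open" case only says the link looks like ordinary HTTPS; whatever headers and body the server then sends back are still untrusted.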

Reply

[-]

caltheon@reddit

No, you just need to sign the packets entering the network. That lets you identify the culprits, or the infected devices in the case of bot nets.

Reply

[-]

AyrA_ch@reddit

[See my comment here](https://www.reddit.com/r/programming/comments/18wxkxd/the_i_in_llm_stands_for_intelligence/kg1k3w1/). Short explanation is that thanks to TPM technology, we can tie data to machines. This does not necessarily allow you to lock out AI generated content immediately, but if you were to detect such content, you can retroactively reject all data previously received by that machine.

Reply

[-]

ph0n3Ix@reddit

> See my comment here. Short explanation is that thanks to TPM technology, we can tie data to machines. That's not quite what a TPM does, but if your core argument is "once it's impossible to be anonymous on the internet, everybody will always be on their best behavior" ... then I'll just say "you're [why](https://old.reddit.com/r/technology/comments/158lbfq/arstechnica_googles_web_integrity_api_sounds_like/) we can't have nice things". > This does not necessarily allow you to lock out AI generated content immediately, but if you were to detect such content, you can retroactively reject all data previously received by that machine. I'll just move right past the part [where you first said "$this could solve everything!"](https://old.reddit.com/r/programming/comments/18wxkxd/the_i_in_llm_stands_for_intelligence/kg0xln3/) and then immediately said "it doesn't necessarily solve everything". > Those rejection lists can be shared between people and companies to pretty much globally lock out a machine forever. You're talking about email/domain reputation lists. Yeah, we already tried this and they don't work that well. Getting _off_ a reputation list after you've been compromised is a NIGHTMARE or even just getting unlucky and getting a new IP from some VPS provider after the last guy burned it...

Reply

[-]

Dwedit@reddit

Retyping text, copy-pasting text....

Reply

[-]

AndrewNeo@reddit

There's no AI trawling H1 and running on its own. People are assuredly copy-pasting things in and out of ChatGPT. Device attestation would do literally nothing.

Reply

[-]

logosobscura@reddit

It’s like RickRolling for the AI Hype Cycle. I’m going to drop this in so many replies.

Reply

[-]

xeneks@reddit

Haha lol I had to read that twice

Reply

[-]

kduyehj@reddit

The I in LLM is silent. Like the P in swimming.

Reply

288 Comments