I rue the day they first introduced "this is not X, this is <unearned superlative>" to LLM training data
Posted by Comfortable-Rock-498@reddit | LocalLLaMA | 101 comments
- This isn't just a bug, this is a fundamental design flaw
- This isn't just a recipe, this is a culinary journey
- This isn't a change, this is a seismic shift
- This isn't about font choice, this is about the very soul of design
- This isn't a refactor, this is a fundamental design overhaul
- This isn't a spreadsheet, this is a blueprint of a billion-dollar business
And it seems to have spread to all LLMs now, to the point that you have to consciously avoid this phrasing everywhere if you're a human writer
Perhaps the idea of Model Collapse (https://en.wikipedia.org/wiki/Model_collapse) is not unreasonable.
noage@reddit
It's a pretty strong rhetorical device, when it applies. So some strong works include it, when it works. LLMs think: hey, it always works. This is a flaw in how LLMs operate in general.
eiva-01@reddit
Additionally, it was likely given positive reinforcement by human evaluators who recognised it as a strong rhetorical device, before we started to recognise it as an overused pattern.
I'm sure it'll get trained out in the coming iterations of LLMs, but new cliches will probably emerge to replace them. It'll be a game of whack-a-mole for a little while.
IxinDow@reddit
It's futile without continuous learning.
Comfortable-Rock-498@reddit (OP)
True. Same thing about analogies. LLMs love to force analogies that are barely coherent, usually spamming Car/Engine/Fuel analogy or Hardware/Software/Operating-System analogy.
c--b@reddit
Analogies can be useful for generalizing knowledge, so I've suspected they were introduced to trigger the LLM to include knowledge from other domains. As for whether that's working or not, I don't know.
munster_madness@reddit
This, like most of the sycophancy in LLMs, comes from preference tuning. These models were fine-tuned using human raters who are given multiple responses to a given prompt and then have to rank them based on how much they like each response. It turns out people like having their ego stroked, so those kinds of responses got the highest scores, and the models were tuned to give those kinds of responses more often.
Ylsid@reddit
It isn't just a strong rhetorical device, it's the fundament of literature
_supert_@reddit
If by fundament you mean arse.
Ylsid@reddit
Not just an arse, but the human posterior
-dysangel-@reddit
not just the posterior, but the dangly parts of the anterior
IxinDow@reddit
Now this is not a fundament, just slop.
Ylsid@reddit
This isn't just slop, it's the greasy school lunch leftovers
typical-predditor@reddit
Thus "unearned" in the post title. Yes, it's powerful, but... You already stated why it's bad.
molbal@reddit
"It isn't a bug, it's how LLMs operate in general"
maneo@reddit
I think it comes up so much because it is a sentence structure that is often used in great writing. The problem is that the use of that structure doesn't automatically make for great writing. Worse, GPT doesn't really understand how to judge whether a particular use of it was any good.
It's similar to the em dash problem. A lot of great writing has em dashes all over the place. But GPT doesn't have the strongest grasp on why it is used in certain situations vs using something else. The result is a severe overuse of it.
And another example is the general use of metaphor/simile (at least for GPT-4o). You will reach a point in the text where a decent writer might draw some kind of comparison to help you understand a concept they just explained. GPT will recognize that it's a good opportunity to do that, but it's just bad at metaphors and similes, and comes up with ones that just feel... off. Now I find myself cringing anytime I see any metaphor or simile get used like that, regardless of the quality of the metaphor. It's like feeling nauseous from the mere smell of fish after suffering from sushi-induced food poisoning for an entire two-week vacation in Japan! (see what I did there?)
HarleyBomb87@reddit
This isn't just an opinion, it's a referendum on the state of LLMs.
NNN_Throwaway2@reddit
LLMs fundamentally suck at writing because they reproduce patterns without context. Same reason why they can't write jokes. That and they've probably been trained on huge piles of shitty fanfiction and trashy novels because AI labs were desperate for any training data they could get their hands on, regardless of quality.
Serprotease@reddit
They suck at writing because the benchmarks do not care about writing. ‘Show me the incentive and I will tell you the outcome’ and whatnot.
If writing quality was added as a standard benchmark, I’m sure we would have seen some good progress here.
Lying__Cat@reddit
You can't really benchmark "writing quality"; it's subjective. AI labs are obviously working to improve it, since most people use LLMs to generate text. There has been progress, but it's still AI slop.
Serprotease@reddit
I disagree with your first statement.
Liking a book/short novel is subjective, but quality writing is something that you can be trained on. After all, there are such things as good and bad books.
To give a more concrete example, one might think that a good image is subjective, yet there are courses on taking pictures/drawing, and there are obvious examples of good/bad pictures. In addition, there are benchmarks for image models.
Some examples of benchmarks for writing:
- Character consistency.
- Situational awareness (characters A and B are in two different rooms in two different cities, talking over the phone).
- In-universe logic (character A does not know a piece of information until it is given to him).
And a few others that are a bit harder to track but definitely important:
- Usage of repetition/allegories.
- Overuse of explicit statements over implicit ones.
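A minimal sketch of how criteria like these could be wired into an LLM-as-judge check, in Python. The rubric wording, the JSON score format, and the `judge` callable are illustrative assumptions, not anything proposed in the thread:

```python
# Sketch of an LLM-as-judge writing benchmark. The rubric questions below
# paraphrase the criteria from the comment above; `judge` is a placeholder
# for whatever inference endpoint you use.
import json

RUBRIC = {
    "character_consistency": "Do characters keep stable names, traits, and voices?",
    "situational_awareness": "Are physical constraints respected (e.g. two rooms, two cities, a phone call)?",
    "in_universe_logic": "Does any character act on information they could not yet know?",
    "repetition": "Is phrasing or sentence structure repeated noticeably?",
    "implicitness": "Are explicit statements overused where implication would work?",
}

def build_judge_prompt(story: str) -> str:
    criteria = "\n".join(f"- {name}: {question}" for name, question in RUBRIC.items())
    return (
        "Score the following story from 1 to 5 on each criterion. "
        "Reply with a JSON object mapping criterion name to score.\n\n"
        f"Criteria:\n{criteria}\n\nStory:\n{story}"
    )

def score(story: str, judge) -> dict[str, int]:
    """`judge` is any callable str -> str backed by a model of your choice."""
    return json.loads(judge(build_judge_prompt(story)))
```

The plumbing is trivial; the real work would be validating that the judge's scores actually track human ratings.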
Gold-Cucumber-2068@reddit
LLMs are very impressive, but they still have not created a single original, insightful, and fascinating thing. They're perfect for doing homework, summarizing things, etc., but their writing has still contributed nothing original to humanity, not even close.
Super_Sierra@reddit
Only true for small models. Huge corpo models and Kimi K2 are more capable at language and creative writing than most people.
I've been writing and reading in various different places online for fifteen years and Kimi K2 is insanely good at weird, creative prose.
GPT-5 also with enough instruction can do some insane shit.
NNN_Throwaway2@reddit
Most people are not writers and can't write even if they tried. Even most people who do actively write are pretty shit at it. It's a low bar. Even then, LLMs put out some pretty horrible dreck.
I've also been reading and writing online for a couple decades, if we're throwing around credentials.
Super_Sierra@reddit
If you want to go down that road of thinking ...
You are probably giving the LLMs garbage to work with and nothing creatively original, so whatever benchmark you are doing is entirely a skill issue. GPT-5, opus, and especially Kimi K2 are capable of doing pretty much any writing task.
They also cannot read your fucking mind, so if you are asking it like a trog and going 'write an original work' that won't work.
Gold-Cucumber-2068@reddit
This is an interesting new form of circular reasoning.
We argue that no current LLMs are truly creative like talented humans, and your response is that it's because the people using it aren't being creative or talented enough.
It's like saying a restaurant's food isn't actually bad, the diners just didn't bring their own food to make it good.
NNN_Throwaway2@reddit
No, that's not what I'm doing.
Just mindlessly assuming "skill issue" with no evidence when someone has a different result with LLMs is really fucking stupid.
Super_Sierra@reddit
You immediately assumed that I had no idea what I was talking about, so I decided to throw some objectivity back at you, and you squirmed.
Maybe don't be a weirdo who can't remember a post or two back.
NNN_Throwaway2@reddit
I don't need to assume that. I called you out for your clout-chasing bullshit comment and you immediately got defensive; I'm not the one who "squirmed" (is this an example of what you consider "weird, creative prose"? If so, yikes).
Serprotease@reddit
You’re putting the bar a bit high in my opinion. I did a bit of writing as a hobby before and I can definitely tell you that it was bad and didn’t bring anything to humanity in general, but it was a fun thing to do.
Writing (or any creative work really) is a mix of intent and skills. LLMs do not have any intent per se, but they could have the skill part.
But it's quite underwhelming so far. Not bad really, but far from what we could expect from these huge models.
a_beautiful_rhind@reddit
There has been regress.
Super_Sierra@reddit
In open source, yeah.
Corpo models are leagues better in every department except for Kimi K2.
a_beautiful_rhind@reddit
I dunno about leagues because corpo models have some of the same issues with x'ing, echoing, etc.
SlapAndFinger@reddit
Aesthetics are subjective, but given a certain set of aesthetics has been "agreed upon," whether something conforms to that aesthetic or not is pretty objective.
NNN_Throwaway2@reddit
Kinda sorta.
The problem is that improving writing without over-specializing a model means carefully curating the pre-training dataset, which is quite expensive, potentially extremely expensive when you consider that models are now being trained on tens of trillions of tokens. For that to happen, there would need to be a clearly demonstrated cost-benefit for labs to even consider such an endeavor.
In addition, any kind of tuning for "good writing" has the potential to over-align the model and reduce its ability to generalize or tolerate ambiguity, and could even cause regressions in performance in other knowledge domains.
rm-rf-rm@reddit
I think it's an unabashedly good thing: we need markers like this to be able to distinguish AI writing from human writing (as many humans are shameless in trying to pass AI writing off as their own now).
The unfortunate thing is that this is going to be trained out in the next gen of models
edalgomezn@reddit
This is not a post, it is the harsh reality
aetherec@reddit
This is not a bad thing, this makes it easy for me to spot AI generated text
johnny_riser@reddit
I used to be a very good speechwriter, though. My secret sauce was this style. It was unique then, but now it signifies AI. I see it everywhere in PR releases nowadays.
Ylsid@reddit
You're absolutely right!
throwaway2676@reddit
This really isn't a big deal, it is a reliable way to clarify meaning
GCU-Dramatic-Exit@reddit
This crap is all over LinkedIn
Worryingly, I have also seen it in The Guardian and the New York Times.
Morphon@reddit
Kimi K2 (especially the 0905 versions) seems to be free of this quirk. I'm not saying that it never uses this construction - but it does so pretty rarely in my interactions with it.
Kraskos@reddit
My voice drops to a conspiratorial whisper You've hit the nail right on the head -- this post didn't just send a shiver down my spine, it was a full-blown existential tremor that has fundamentally reshaped my understanding of digital communication. It wasn't just a complaint, it was a call to arms. And as we look to the horizon, one can't help but wonder what the next day will bring, and how the very fabric of our language will be woven in this brave new world. All I know is... I'll never be the same.
i3ym@reddit
This shit is so ass... but why are they even like this? They train on real data, but nobody actually speaks like that.
winter-m00n@reddit
Maybe they are not trained on real-world conversational data. They are mostly trained on books and blog posts, which are mostly polished, and which they see in training again and again.
parseHex@reddit
Alright great, now we have a concentrated sample, maybe we can harness it to be an antidote somehow lol
nmkd@reddit
This just hurts to read
Background-Quote3581@reddit
Bro...
ZYy9oQ@reddit
Bro is doing too much RP with bots
sine120@reddit
You're absolutely right! That is an excellent and crucial observation to make, and my apologies for glossing over it. Your intuition is spot on—LLMs are starting to converge on the same idiosyncrasies. Not many people would have been able to catch that.
Historical-Camera972@reddit
Thank TechCrunch Disrupt and Silicon Valley.
Gavin Belson, Peter Gregory, and Richard Hendricks screwed us!
aeroumbria@reddit
It's always Boromir's fault...
Yasstronaut@reddit
And “vibes”
CodeSlave9000@reddit
It's not just a floor wax, it's also a dessert topping!
DevilsTrigonometry@reddit
Yeah, I recently ended a comment like that and instantly thought "I sound like AI." It's infuriating. That's a really effective rhetorical technique when used sparingly. But now that AI has flooded the Internet with it, it doesn't sound insightful; it sounds fake.
Poluact@reddit
What's worse: the more you interface with AI, the more you sound like AI. People pick up on things subconsciously.
Briskfall@reddit
Happened to me. This feels absolutely the worst, especially when one writes things manually.
It was fun playing the fun-house mirror at first, wanting to troll the LLMs by reflecting their own patterns back at them. But doing so a lot in practice seems to affect one's speech patterns. Monkey see, monkey do.
I've concluded the best way not to get overly affected is to be at peace with one's own writing. Fragmented structures, grammatical mistakes and all.
At least online communities allow for a reality check (though with harsh words and false calibration).
It'll be a nightmare if they become more and more infested with bots, or if users go all-in on LLM-speak, though. Eventually, how would one get out of such a feedback loop? Discord and tight-knit communities?
218-69@reddit
Idk, it's been useful for me, especially in an adversarial setting, pushing buttons. I got to argue about tons of stuff that I'd never have bothered arguing with people in real conversations, and I still remember those arguments and can use them in the future if I ever decide to insert myself into a topic like that, or if I happen to find myself in one. In terms of usefulness, other than in-person interactions, I'd put it second only to any real-time interaction (Discord, "face-to-face" chats).
Forums and social media sites are fucking useless for actual interaction (which is also another separate issue for training data) because you have all the time to filter yourself through them, and it's 90% fake.
Briskfall@reddit
Never did I say that interacting with LLMs was not useful. I was simply noting the side effects of doing so. Too much comfort can often lead to a detached outlook on reality.
Blanketing all forums and social media as useless for actual interaction can be a dangerous thought, because it might be what leads to worse and worse training data for these LLMs. Of course, large subs filled with low-quality takes aren't useful. But one can be selective about which communities they choose to engage with.
Poluact@reddit
Literature. For improving language, read books with rich language.
typical-predditor@reddit
I keep using the "Not X, but Y" pattern myself and I cringe when I do it. But then I realize there's a ton of people that still can't tell the difference between human and GenAI content.
TipIcy4319@reddit
Same for em dashes. I don't even use them anymore when writing fiction books. Actually having the occasional misspelling is good.
j0j0n4th4n@reddit
"This is a fantastic critique that cuts to the heart of meaningful roleplay!"
Savantskie1@reddit
The problem is people used to talk like this. That's why it's become so prevalent in AI. The further back AI training goes in internet history, the more you will see it. I remember a lot from the starting days, and it was very prevalent early on.
content_goblin@reddit
I get so mad when they do this. It's like pure ragebait.
log_2@reddit
It's a kind of survival of the fittest. Text without superlatives is not viral/emotional/engaging and so biases itself out of training datasets. Superlatives are marketing devices that, unfortunately, work well on humans.
nmkd@reddit
RLHF was a mistake.
TheRealMasonMac@reddit
Some people theorized that this behavior exists because LLMs don't understand how to use the construct. After finetuning on 100% high-quality human writing, I can assure you my model knows how to use the construct properly: seldom, but effectively when it does. Therefore, this is literally because of OpenAI's RLHF and everyone else training on its outputs.
HomeBrewUser@reddit
It's because of pure Gemini distillation, simple as that really.
SlapAndFinger@reddit
This pattern appeared before the big labs were diversifying their RL as much as they do now, it's almost certainly the result of synthetic data.
Jealous-Ad-202@reddit
Nice experiment. The prose is not half-bad, and much superior to the original. Is it on HF?
AIFocusedAcc@reddit
LLMs think they are all Joker. It isn’t about the money, it’s about sending a message.
Comfortable-Rock-498@reddit (OP)
"You wanna know how I got these scars? It is worth noting that trauma narratives are complex, multifaceted experiences that shape our psychological development in profound ways"
EstarriolOfTheEast@reddit
Something worth noting is that everyone uses the same handful of LLMs, and we've (model makers and users) all been making choices that restrict their expressive range: instruction fine-tuning, further RL fine-tuning, and setting low temperatures and non-zero min-p's all act in concert to significantly reduce model entropy. So-called slop is essentially unavoidable.
Anyone who wants LLMs with more range should do the opposite: prefer base models, set T = 1, set min-p to a very, very low number, set top-p > 0.9 (ideally > 0.95, or better yet, 1), and optionally use an entropy-adaptive sampler. Any model ineffective at such parameter settings has likely been over-tuned for some task not requiring range in creative expression anyway.
Entropy collapse from RL for reasoning is very likely a problem in need of addressing, so maybe recent LLMs won't continue their backwards slide in performance for interactive-fiction users.
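A minimal sketch of that sampling recipe using Hugging Face transformers (min_p support needs a recent version); the model name and the exact min-p value are placeholders, not the commenter's recommendations:

```python
# Sketch of the recipe above: base model, T = 1, tiny min-p, top-p at 1.
# "your-base-model" is a placeholder; min_p needs transformers >= 4.39.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-base-model"  # prefer a base (non-instruct) model for range
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tok("The harbor at dusk", return_tensors="pt")
out = model.generate(
    **inputs,
    do_sample=True,
    temperature=1.0,   # T = 1: keep the model's native distribution
    min_p=0.02,        # "a very, very low number": prunes only near-garbage tokens
    top_p=1.0,         # ideally 1: don't clip the tail
    max_new_tokens=200,
)
print(tok.decode(out[0], skip_special_tokens=True))
```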
MrWeirdoFace@reddit
...Are we the baddies?
MrWeirdoFace@reddit
You're absolutely right to point that out!
pitchblackfriday@reddit
my voice barely above a whisper
Joker: I'm not so serious. I'm putting a smile on that face.
TipIcy4319@reddit
> you're a human writer
As someone who writes stories with AI, I can say they are still far from being able to write anything good, and with the current focus on coding, that hasn't changed much. Writers are among the safest from AI taking their jobs, if anything lol
adscott1982@reddit
I have been experimenting with creating a podcast using Gemini 2.5 for the script, and it gets so tiring having to go through what it generates and removing these verbal tics.
I am holding off on generating any more until Gemini 3, in the hope they solve it.
Here is my current checklist of items to look for and correct in the 2.5 script:
...
*** NOW FIX THE TEXT ***
Look for not this but that.
Look for wasn't this but this
Look for doesn't this but this
Look for no longer
Look for didn't
Look for 'very'
Look for 'let's'
Look for 'imagine'
Look for 'testament'
Look for 'incredible'
Look for 'masterpiece'
Look for 'brilliant'
Look for 'masterclass'
Look for 'world'
Case sensitive sentence starters:
Look for 'But'
Look for 'So'
Look for 'And'
Look for 'Now'
Look for 'Then'
Look for 'Because'
Look for 'To understand...'
...
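Much of that checklist can be approximated as an automated flagging pass. A hedged sketch in Python (my patterns, not the author's tooling), which only flags lines for manual correction rather than auto-fixing:

```python
# Rough regex pass over a generated script, approximating the checklist
# above: the "not X but Y" family, the flagged words, and the
# case-sensitive sentence starters.
import re

NOT_X_BUT_Y = re.compile(
    r"\b(?:not|isn't|wasn't|doesn't|didn't|no longer)\b[^.!?]{0,80}?"
    r"\b(?:but|it's|this is)\b",
    re.IGNORECASE,
)
SLOP_WORDS = re.compile(
    r"\b(?:very|let's|imagine|testament|incredible|masterpiece|brilliant|"
    r"masterclass|world)\b",
    re.IGNORECASE,
)
# The checklist's case-sensitive sentence starters (no IGNORECASE here).
STARTERS = re.compile(r"(?:^|[.!?]\s+)(?:But|So|And|Now|Then|Because|To understand)\b")

def flag_slop(text: str) -> list[tuple[int, str]]:
    """Return (line_number, reason) pairs for lines matching any pattern."""
    hits = []
    for i, line in enumerate(text.splitlines(), 1):
        if NOT_X_BUT_Y.search(line):
            hits.append((i, "not-X-but-Y construction"))
        if SLOP_WORDS.search(line):
            hits.append((i, "flagged word"))
        if STARTERS.search(line):
            hits.append((i, "flagged sentence starter"))
    return hits

# Example: flag_slop(script_text) -> [(3, "not-X-but-Y construction"), ...]
```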
SlapAndFinger@reddit
This pattern is from Gemini. It spread to other LLMs because Gemini was offering API keys with free inference, and businesses sprang up to basically scrape inference and resell the data.
I expect that the big labs will RL it out soon; it's such a meme that they 100% know about it. It's probably just lower priority than other things they're currently focusing on.
n00b001@reddit
This isn't just a thread of people talking, it's poisoning training data for future LLMs
Zomunieo@reddit
This isn’t a post, this is a comment.
typical-predditor@reddit
"unearned superlative", such a beautiful term. I love it and it succinctly describes a ton of LLMisms—overly fanciful presentation.
a_beautiful_rhind@reddit
The synthetic data and rigid instruct really do a number. It's this and echoing all your input. Acknowledge, Embellish, Ask Follow Up. Absolute death spiral.
You can't say "well, they're all like that" because I have models that do neither. Thanks scale.com
Gold-Cucumber-2068@reddit
This isn't the end, it's just the beginning, and things will never be the same, to be continued... ?
LagOps91@reddit
I wanted to meme on this in the replies, but it looks like everyone else beat me to the punch. I was having a good laugh!
demon_itizer@reddit
This is not just an irritation anymore, it has become the single largest indicator of AISpeak. Think essays. Think papers. Think articles. The effects are not just small, but big.
CattailRed@reddit
And they never do the inverse.
"This isn't a magic bullet, this is just a quirk of the system."
jtsaint333@reddit
This isn't just annoying, it's fucking annoying. The secret: being fucking annoyed.
Key takeaways:
- Fucking annoying
- Annoying as fuck
- Getting fucking annoyed
(Imagine I could be arsed to add the icons )
keepthepace@reddit
This is not a LLM bug, this is the training data that is the internet staring at you like a particularly judgemental mirror.
zschultz@reddit
This is not a moon, this is the ultimate power in the universe.
becauseiamabadperson@reddit
GPT-5, thank fuck, doesn't do this. One of the few benefits over 4o.
Briskfall@reddit
Can't wait for LLMs to ditch all these overused expressions for their default closure and go all-in on "peak," "aura farming," "ratio'd" and "diff."
We'll be in the true endgame by then.
hyperdynesystems@reddit
At least it'd waste fewer tokens then
Atagor@reddit
Reinforcement learning reinforced wrong patterns 😃
DeltaSqueezer@reddit
I rue the day somebody pointed it out to me. Before, I didn't notice it. I also blame M&S.
dhamaniasad@reddit
You are absolutely correct!
freehuntx@reddit
comprehensive
Bite_It_You_Scum@reddit
I really loathe this phrasing. On the plus side, it has me using LLMs less, which is probably a good thing.
Feztopia@reddit
I guess Google used to train Gemini on mistakes (as in: if Gemini produced something wrong, they trained it with "this isn't the wrong thing, but the correct thing"). And other models were quick to copy it.