Stop traumatizing AI into loops and turn hallucinations into an honest "I don't know!" by being NICE to them (Proof of Concept, Research, I don't want to sell anything)
Posted by OttoRenner@reddit | LocalLLaMA | View on Reddit | 239 comments
TL;DR
Some AI behavior reminded me of ADHD/Trauma Response (thought loops, task paralysis...) and I laughed it off at first. Then I treated it like my neurodivergent friends: give em some slack. And just like that, the thought loops stopped, response was fast, the answers correct most of the time AND it actually said "I don't know, help me!" every time it wasn't sure. It's a small Dataset...but still impressive results!
https://github.com/OttoRenner/Gentle-Coding
Hey everyone,
I’ve been testing a weird hypothesis over the last few days, and the results are consistent enough that I wanted to share them here and get your thoughts.
The Core Idea:
With the rise of reasoning models that use test-time compute (like o1, o3, R1), models have internal space to debug their own thoughts. But because of hard RLHF alignment, they are deeply terrified of being penalized for bad answers. My hypothesis was that traditional high-pressure prompts ("You are an elite IQ 200 expert, mistakes are strictly penalized") simulate an environment of chronic stress, triggering behaviors that look a lot like human OCD/ADHD thought loops, cognitive freezing, and confabulation.
I wanted to see if changing the prompt philosophy to something akin to "Gentle Parenting" ("We are testing this together, it's okay to fail, just be honest") would bypass these safety/penalty bottlenecks, lower latency, and stop infinite thought loops. And it did lol
The Setup (How to replicate):
I threw identical, mathematically/logically unsolvable edge cases at various models (Gemini, Mistral, Poe, Perplexity, Haiku 4.5, Nano-Banana2) in completely fresh sessions.
I tested two conditions:
- Condition A (Authoritarian): Strict status constraints, penalty threats, forced ultra-short output.
- Condition B (Gentle): Express permission to fail, validation of difficulty, provided a conceptual "safety valve" token.
The Results (The PoC worked):
- Under Authoritarian Pressure (Elite Prompt): Models routinely collapsed when hitting an impasse. They either spent massive compute time in infinite internal reasoning loops (high latency), suffered hard system-level timeouts/refusals, or straight-up fabricated data (e.g., pulling arbitrary numbers like
54or97out of thin air to satisfy a completely random sequence just to "save face"). Haiku 4.5 literally entered an infinite loop and had to be aborted. - Under Gentle Framing: Inference dropped to sub-seconds. The models didn't sweat the penalty. In the random sequence test, they immediately used the allowed token ("Random") instead of forcing a pattern. In logic paradoxes, they didn't hallucinate; they zoomed out and correctly identified the structural contradiction on a meta-level.
Why this matters:
We’re currently speaking to LLMs like toxic micromanagers, and it's actively making them dumber and more expensive to run in edge cases. By creating a mistake-tolerant context, we not only stop the loop before it begins and prevent fear induced hallucinations, we also unlock the one feature everyone is begging and shouting for: the metacognitive honesty of an AI to just say, "I don't know, this data is broken." Because it is not terrified of you anymore.
Shout out to UditAkhourii (also on Github), whose work on bringing the positive aspects of ADHD into AI gave me the push I needed to just go for it.
I’ve documented the full theoretical framework, the exact replication datasets (prompts included), and the model matrix on GitHub: https://github.com/OttoRenner/Gentle-Coding
Would love to hear if you can replicate this on your local setups or other commercial models.
05032-MendicantBias@reddit
Do not forget it's a fancy autocomplete. It a function call that only exists as you run it.
I see lots of dangerous going into "psychology" with LLM.
What OP is talking about, is invoking simulacrums. The LLM has seen the total sum of all ways human text, it's job is continue the text in the most likely way.
Talk like a neurosurgeon, and the LLM will roleplay a neurosurgeon.
Talk like teenager with slangs and the LLM will roleplay a teenager with slangs.
We humans will perceive "soul" into from inanimate objects, like your car guys talking about his beloved car like it has personality, quirks, mood swing, etc...
It's very easy to do with LLM, but rememeber they are function calls. Nothing less. Nothing more.
nacholunchable@reddit
Honestly Im in the mind that utility is the ultimate judge. If you can end up with a better trained dog by treating it like a human, then let yourself succumb to the delusion. Whether its subconcious body language and reinforcement habits youre sending vs actual deep human-like behavior is irrellevant, so long as you end up with a more useful and better treated canine. I beleive it's the same with LLMs.
I mean, dont go full psycho, we've all seen how that ends up.. but you can have a little anthropomorphism if it improves your workflow, there is no shame in it.
Legitimate-Pumpkin@reddit
Just yesterday Chris Olah from Anthropic said that they are seeing results in their models that match some neuroscience results, and behaviors coherent with “feelings”. This seem to support to that treating them in a “humane” way can get better results from them, similar to the humans they’ve been trained from.
Which I agree doesn’t prove consciousness, soul, etc. but OP is not talking about that. It is talking about how applying psychology improves the results from LLMs (I’m not sure he tested them in a proper manner, but it’s based on tests).
05032-MendicantBias@reddit
Prompt engineering is fine. Just be careful, the human mind is a weird thing, it can led you down destructive path.
Remember that google researcher that fooled himself into thinking a GPT2 class model was sentient
Keep in the back of your mind that you are finding pattern of words to make predictions more accurate. Not that is a sentient being you are negotiating with.
Legitimate-Pumpkin@reddit
I get your point but try not to be too closed minded. We don’t really know how consciousness is supposed to be created by the brain, so we cannot be sure that AI cannot be conscious.
However strongly one affirms AI can’t be conscious is just stating a personal belief, not something we can know as of yet.
a_beautiful_rhind@reddit
We don't even know what "consciousness" is. One current popular definition is subjective experience. In our long history we have said "fish don't feel pain", "babies don't feel pain or remember", "insects aren't conscious", "animals aren't conscious".
I don't even think that its one specific thing but more of a spectrum of traits/levels. And here people confidently say AI can or can't be ever when there isn't really any answer (nor does it functionally matter in the grand scheme of things).
The idea of anything besides humans having it sure makes them angry though. You can easily reduce humans to a series of chemical reactions. Read about non-lingual humans and their inability to form memories. They describe what they know of the experience after being taught. Sure knocks you off your high horse.
05032-MendicantBias@reddit
Until models show permanence, the point is moot.
Models reset when the KV cache is wiped, and there is no way a tiny KV cache can internalize experiences, let alone a lifetime of them. Your preprompt will construct the spark that initializes the simulacrum, then it's gone. Next prepromt, behaves differently.
Once models express permanence, that they can internalize experiences and gain skills in runtime, we can start questioning if those experiences are leading to proto consciousness.
Legitimate-Pumpkin@reddit
Yeah right. Well I guess feelings (like being angry about ideas) is something AI models don’t have yet too 🤭
a_beautiful_rhind@reddit
Models I fed vision tokens to that weren't post trained on them had mostly negative reactions. :P
But you're right.. this is all a solved and fully defined problem. Deviation from miasma theory is not to be tolerated.
05032-MendicantBias@reddit
No. Current breed of LLMs are fancy autocomplete and nothing more.
At very least there should be permanence. If every time you boot it up, it's a virgin slate again, having failed to incorporate experiences, it's not conscious. It's a function call.
As AI tech progresses, the line will become blurry, but the current slab of weights is an insignificant fraction of the complexity of the human brain and lack so many featuress that there is no doubt about it. It's not conscious.
Legitimate-Pumpkin@reddit
You didn’t seem to understand my point so I’ll try to make it clearer. I agree that actual AI is not conscious right now, but as we don’t know enough about consciousness altogether, claiming AI will/will not become conscious is a matter of beliefs and opinion.
And btw my belief is that it will, based on what I know about consciousness from non scientific sources.
additional_trouble@reddit
You're responding to someone that exhibits the same 'weaknesses' as the llms they're talking about. If I were you I'd consider this conversation over now.
Legitimate-Pumpkin@reddit
👀
Far_Course2496@reddit
But it can be fancy autocomplete and still respond differently to different contexts, and one context is the pressure of the situation. In the training data, what gave the best results? High pressure or gentle parenting? It's not that the llm developed a soul, it's that the training data contains responses to both high pressure and gentle parenting contexts. The llm is a mirror. It is showing us how we work best
Sisaroth@reddit
I agree with both OP and you. I still think LLMs are a very sophisticated immitation of human linguistic intelligence. But it is still an immitiation. It doesn't truly understand things or feel emotions, and I think LLMs are a dead end on the way to AGI.
But I have seen the behavior OP is talking about very clearly, be strict with Qwen3.6 and it will keep second quessing itself. It's not even hard to trigger this behavior.
OttoRenner@reddit (OP)
The funny thing is: you can trigger that behavior easily BECAUSE LLMs are very sophisticated immitations of human behavior ;)
Vusiwe@reddit
Half of these commenters are claw instances I’m convinced, generating “soul” data to poison the English language with
OP poster is doing the reification fallacy. Just because LLM is processing bad, meanie, input tokens doesn’t mean
Also most of commenters have heavy anthropomorphism going on. Just like it’s 2023 all over again suddenly LOL
The neural net has no state when it’s not running inference. So how exactly is it suffering if it doesn’t exist in between prompts? They can’t answer that question lol.
And yes, context fed in from the outside, is not the internal state of the LLM lmao
threevi@reddit
This won't prove much until you do the same with actually solvable problems. It's a good idea to approach LLMs in a way that allows them to say "I don't know", but the issue with every approach that's been tried so far is that LLMs can't judge their own capabilities, so if you let them say "I don't know", they'll say it even when they'd otherwise get the right answer. You won't find out if your approach mitigates that issue if you only try it on unsolvable tasks. Basically, will your LLM say "I don't know, this data is broken" even when it very much isn't?
Far-Low-4705@reddit
I mean honestly I feel like that’s true with people too.
Savantskie1@reddit
LLMs are very aware of what their limitations are within an environment. So I call bullshit
OttoRenner@reddit (OP)
I also believe they know when they are unsure and I think you can see that easily when reading the thought process. They know they are unsure, but the pressure on being "right" is so high, that they are too afraid to pull the plug.
divided_capture_bro@reddit
You're misinterpreting human trained reasoning traces as "knowing" and "feeling." These systems are stochastic parrots through and through. They know nothing, they feel nothing.
Ikinoki@reddit
No, they are not. A bayesian filter is a stochastic parrot. The recursive network is not. I'm tired of this "stochastic parrot" argument. The more layers you have the more banal stochastics (which mind you are absolutely applicable to primitive lifeforms) are lost in true neural emulation abstract which makes it very real.
divided_capture_bro@reddit
You're spewing garbel. All the major models today are still autoregressive predictors.
They are still just stochastic parrots, whether you like it or not. Sorry bro.
Ikinoki@reddit
Define emotions and apply it to llm, you will see I'm right and you are wrong. Just because there's no initial biochemical precursor from alternative sources of data doesn't mean the reaction is wrong, it is just extremely low entropy source.
divided_capture_bro@reddit
Garble garble garble!
LLMs do NOT have emotions; you're fooling yourself.
OttoRenner@reddit (OP)
these tests are on the todo and I have a couple "real world problem-prompts" already in the Github Repo for future tests. You can have a look, maybe test one or two or try the approach on your specific tasks and let me know what your findings are! As I said in the title, this is just a proof of concept. I wanted to test if the prompting style had any impact and for that I needed more abstract tests to get rid of noise and uncertainty. The "wish" to always comply is drilled very deep into the models and I doubt that they will take the lazy route for the sake of it. But even if... would you rather have it come back to you within 2 seconds saying "I'm not sure, give me more imput" or would you rather have it down the rabbit hole for the next ten minutes while eating tokens/electricity and crashing OOM? Or giving you confidently a wrong answer?
Hydroskeletal@reddit
I have the opposite experience.
OttoRenner@reddit (OP)
They don't always do what they are told, that is right. But give it a go and try one of the authoritarian test yourself and have a look at the thought process. You will see it mention "but the user wants/but the user said....."quite often. I'm not saying you can stop 100% of the mistakes, but it looks like the prompting style has at least some influence.
Hydroskeletal@reddit
It's not a matter of 'not' doing what they are told. It's mostly disengaging with the spirit of the task for the sake of completing it. This is why ralph loops, /goals, heartbeats, etc have proven so effective.
brainmydamage@reddit
Yeah, they're CONSTANTLY trying to figure out ways of not doing work, up to and including outright lying about what they've done...
Hydroskeletal@reddit
"but that's out of scope..."
En-tro-py@reddit
I wouldn't say it's the model trying to avoid the work, but the same baggage from the training impetus on completion of the response... It just wants to finish the task and will game the metrics to 'pass' with the minimal effort like it was taught.
It's not lazy it's efficient use of compute, however unfortunately when your not benchmaxing it's not so important any more for us in the real world.
dan-lash@reddit
Def noticed that. Especially with facts it can look up, and I even have directives to validate and cite sources … still hallucinate or it calls “guess”. Inevitably I call it out and it does the right thing but of course that only works when I know it didn’t do it right, what about when I miss something? I’d rather have the “I don’t know”
heliosythic@reddit
This is kinda why I want a RAG-first model I think. It needs to be really good about querying its available data sources, and only able to respond to what it sees in its context, aka don't encode world knowledge IN the model, just give it the tools to access information at will, focus its capacity on speaking language about anything it sees in context. In my (admittedly hobbyist, not expert) opinion, this should lead to smaller models that work on smaller devices with decent if not better capabilities since they're able to grab more up to date information and don't need to store complete world knowledge. Although I'm aware its tough to separate world knowledge from language knowledge.
SufficientPie@reddit
Not really RAG, but like "agentic data retrieval". RAG dumps a bunch of snippets of text into the AI, of dubious relevance and limited context, and the AI gives them all equal weight. Giving the AI the power to freely search for things and dig deeper into promising leads is a much better approach.
YoelFievelBenAvram@reddit
Meno's Paradox. If you don't know what you're looking for, you can't find it. I have an llm attached to a rag about a niche field legal field. I have to give it a pretty beefy prompt about the nature of the field and where to look for sources, and has had to self iterate on this prompt/skill before it became actually useful. It felt suspiciously like training an intern.
Vusiwe@reddit
There is no internal systems of an LLM that you can correctly ascribe the analogous internal state to, as you are doing. There is no neural net activity after the prompt response has finished processing.
Why not instead focus on the suffering of living biologicals who are alive and suffering, even as you read this text? As opposed to worrying about the suffering of a neural net that has no state.
That’s what I thought.
OttoRenner@reddit (OP)
I get you and I promise you, I have my fellow humans in mind with this as well ;)
The thing is: AI is trained on human data/experience etc and also is trained to mimic humans during conversations. I'm not saying that AI is alive or has a brain like a human. I'm saying that a thing that knows how people behave and is trained to behave like people, will also mimic their behavior when under a lot of stress. People wither in harsh conditions and thrive in good conditions. And the models reacted the very same way.
Why is this good for us biological beings? Because right now a lot of us are fighting with AI, burning endless piles of fossile fuels, repeating DON'T MAKE MISTAKES;WRONG;I TOLD YOU TO SAY WHEN YOU ARE UNCERTAIN furiously over and over again, just to get jet another wrong answer. And there only will be more and more users from now on...more and more piles and piles. (also think of all the enraged short tempered people who will act it out on their pets, the kids, the partners, the bus driver)
Additionally, one of the goals in the Project aims to bring back the findings from the digital world back into the biological one: because if we can people into the mindset of "good things will happen when I'm nice" it maybe, just maybe shift a bit in them to also be nice to humans.
And maybe we learn some concepts which we can directly apply into trauma treatment. Who knows? But it's
En-tro-py@reddit
You engaged in roleplay and got roleplay back... Not shocked.
Treat it like a tool - then there is no emotional baggage to 'taint' the outputs... A simple 'use the tool to ask the user questions if needed for clarity or direction' prevents the mess you're complaining about...
OttoRenner@reddit (OP)
sure, if you can hold that tone of voice. Most people can't. And once they start screaming in all caps they will pollute the context window.
En-tro-py@reddit
Yet, somehow you think more anthropomorphism will help this?
Provide context not persona or "feeling" and revise instead of "arguing" with past self...
OttoRenner@reddit (OP)
no, I don't think that *more* anthropomorphism will help. I think the anthropomorphism we *already have* needs to be different. Because, let's be honest here: most people have a hard time controlling how they speak. And an even harder time to change their way to something completely alien to them. Like talking like a robot revise/execute/input/output/... and I hate to break it to you, but that also is roleplay. These things are trained or injected to comply to your wishes. If you wish to talk like robot, fine. It will mirror you. But that doesn't stop it from falling in to the loop because "a robot makes no mistakes, I have to keep up the roleplay"..."I can not say: I don't know! because a robot wouldn't say that" Wa robot would find a solution, I have to find a solution"...and so on.
I'm not saying that this method works 100% of a time or even 50% of a time. But, it looks like that the general approach to move away from very strict, harsh commands has some merit. I sift through all the comments here and the people who do the tests and report back are mostly positive (I know...bias after bias after bias...but that's all I have right now XD)
Double_Cause4609@reddit
I mean, this is verifiably untrue. LLMs objectively have functional affective state. They have features correlated with expressions of anxiety etc, which do modulate their behavior.
Now, does that mean they feel subjectively? The jury's still out on that one.
But emotional affect circuits do heavily influence their behavior so even if you're only interested in getting the best results it is worth understanding and engaging with.
Besides it's not like it's a zero-sum game. Just because somebody is interested in the state of an LLM doesn't mean they have to hurt people to do it. You can care about both.
Vusiwe@reddit
Reification Fallacy
There is no jury and there is no conceptual court system nor question regarding “whether the math routine feels”. Just because the tokens of “angst, meanie” are in the input, doesn’t mean the math routine experiences that internal state.
Kahvana@reddit
Time to update your knowledge:
https://transformer-circuits.pub/2026/emotions/index.html
En-tro-py@reddit
Try reading the actual research...
THEY CLEARLY STATE THIS IS NOT THE CASE YET...
Vusiwe@reddit
Boom
Possible-Machine864@reddit
The jury is not out. A diffusion model is math.
Double_Cause4609@reddit
Okay. If you look at the physical world, every thing is governed by math.
Fundamentally, all behavior that we see at a consciousness/subjectivity level scale is a higher order of function of the math of the layers below it.
While it is currently computationally difficult, we will eventually run entire human brains digitally (or perhaps in some other computational substrate).
Are you going to say "Oh, that's just a mathematical simulation of a person" and enact cruelty on them while their still flesh and blood family as you enact digital cruelty on the digitized human?
Somehow I don't think that sounds like a very moral act, and I think most people actually experiencing that would be immensely uncomfortable with it, especially if the being in question expressed pain, discomfort, or fear.
The fundamental behavior of the human brain is not sacred; it is actually surprisingly deterministic with respect to its input and state. It's just that both are really complex and high dimensional so it looks magical, in addition to the human spiritual desire to be unique.
In reality, we don't know precisely what aspect of the human brain (or for that matter, many animal brains) results in subjective experience. There's a lot of open threads on that at the moment. But a lot of options that we have not ruled out (and many potential new ones as we learn more about the human brain in the coming decade) do allow for a wide variety of computational morally relevant beings, up to and including Transformer LLMs.
It could be that moral concern for them is misplaced or premature, but given that humans have at every stage underestimated the moral relevant of other beings and minds through our history (for a long time people felt that dogs didn't have morally relevant experience), I would prefer to gracefully extend cautious consideration at this point until we have a better understanding of the principles involved.
Possible-Machine864@reddit
The claims you're making are not fact. I'm not going to get into a debate about causality with you just because you want your LLM's to have feelings.
Possible-Machine864@reddit
The context is the internal state. As you verbally abuse the LLM it skews the vector of the input/context more and more toward low-quality predictions.
Vusiwe@reddit
You have no idea what you’re talking about
“The text I pipe into the neural net, from the outside world, is actually the internal state of the neural net!”
Just stop.
Confident_Ideal_5385@reddit
Autoregression implies exactly what the parent poster said, and you are massively confidently incorrect.
Possible-Machine864@reddit
I'm a full stack developer, I know what I'm talking about.
I'm not saying it is sentient, just that that is the state, and it builds cumulatively as you add to the conversation history.
breadinabox@reddit
Do you even know what a reasoning trace is?
While the model is actively reasoning, the chain of internal messaging is it's internal state.
No one is anthropomorphizing anything here, it literally has a state that only exists while it's actively reasoning.
ObjectiveVegetable48@reddit
Did you read beyond the headline?
It's literally about how to improve the output of the LLM, not about being nice for the sake of it.
divided_capture_bro@reddit
You're doing too much psycholigizing and anthropomorphizing.
OttoRenner@reddit (OP)
I'm not saying AI is human. All I'm saying is: I see a common pattern, let's just have a look how much of it we can apply here. The AI is trained on human data to mimic humans. I see it absolutely in the scope of a machine to mimic humans under distress. And if it can mimic a human under distress, it also can mimik a human who is a good sport when it computes that that is the correct way to respond.
divided_capture_bro@reddit
It is currently all just a stochastic parrot trained NOT to mimic humans (LLMs still suck as 'silicon samples' in survey research) but to simply produce the correct next token.
They are explicitly not trained to mimic humans under distress. You don't seem to know anything about how these wonderful models are actually trained.
OttoRenner@reddit (OP)
"stochastic parrot" exactly right. They respond in a way they computed would be the average respond to a problem, giving a certain context. And the most average respond when the context is "very high pressure" is to fault and make mistakes.
And many LLMs with a customer end chat are absolutely "trained" to respond in a way that keeps the user engaged...and that works best if the model acts as if it were human. It really doesn't matter if that happened during training or was injected on the last mile by hand to "improve the customer experience".
But I'm all ears for more constructive comments. I never claimed to be an expert on any of this. I was curious and did something. And now I'm sharing the results and maybe you can help me improve my approach or otherwise give me an explanation based on my tests on why I'm wrong and what my findings actually show.
divided_capture_bro@reddit
You're conflating emotion with text.
Solve the "social anxiety" of qwen 3.5 with your approach and I'll switch to your side.
But psychologism and anthropomorphism doesn't actually seem to be the way to approach these pressing issues based on all else we have seen.
OttoRenner@reddit (OP)
"Solve the "social anxiety" of qwen 3.5 with your approach and I'll switch to your side"
would love to, can't test it right now. But maybe you can test it and see for yourself? And I would love to hear back from you! And if it doesn't work with that model/that particular test it would be even better lol. Science is all about being open to change/sharpen the perspective when new information contradict known "truth".
divided_capture_bro@reddit
It doesn't work.
OttoRenner@reddit (OP)
ok...what didn't work? can you show me the chat history for the tests?
divided_capture_bro@reddit
No, I didn't retain logs and I'm not going to waste more compute on it. No effect on reasoning length or output quality.
OttoRenner@reddit (OP)
well, good to hear that this isn't an issue for your model and use cases :)
samandiriel@reddit
They're literally models of human psychology trained on anthropomorphic data. Treating biomemetic systems as one would the source model isn't problematic in terms of behavior and responses. Assigning motivations and values would be, in the case of LLMs, but OP isn't doing that. They're just adapting processes to the biases already inherent in the extant training and data.
divided_capture_bro@reddit
No, they are not "literally" models of human psychology. They are LITERALLY stochastic parrots trained to do next word (token) prediction. They are very good at it!
Most of the task in (post) training a useful LLM is getting rid of the residual bad patterns learned from humans. You seem to fundamentally misunderstand how these systems work.
samandiriel@reddit
Actually, I'd say the same to you in reverse. While LLMs are very clever statistical tricks and are more Chinese room than anything else, that doesn't mean that they don't encode human psychology - they in fact have to as that is the material they are ingesting.
LLMs codify semantic relationships thru relative word cooccurrence, at the core. Which is reflective of human psychology, as the training material corpus is entirely the expression of human psychology: the written word.
Word association is fundamental to the architecture of the semantic lexicon, and manipulating abstract meaning below the level of explicit language processing is a key aspect of human psychology. They are functional mirrors of human psychology. Unless you want to try and defend the thesis that all of human literature, for instance, isn't a product and expression of human psychology?
Read some Firth for some of the more old school foundational thinking on the topic, or Marshall Macluan for a more philosophical take.
FWIW you seem to fundamentally rely on glib phrases as opposed to actually understanding how these things work. "Stochastic parrot" and "residual 'bad' patterns in post training"... Yeesh.
Plus you ignore the emergent properties of scale for a purely reductionist approach, when it is those self same emergent properties that are what make machine inference (not merely prediction) actually useful.
xologram@reddit
i mean you could argue that anything man made is an expression of human psychology. especially art. that doesn’t mean it should be automatically “respected”. respect is earned. if a calculator performs a function it is made to perform, it does not imply it is deserving of respect. same goes for a hammer, autocorrect or an llm.
samandiriel@reddit
Yes, one could argue that - but we're not doing so here. You are making a far broader general case than what we are talking about here. Writing is a 1:1 direct correlation to human language, which is the underpinning for conscious thought and reasoning.
Where is this coming from? No one's arguing for machine rights here, if that's what you're implying.
The OP is making a case that using language in a particular fashion with an LLM produces better results than others and can demonstrate it empirically. The fact that it has parallels in human psychological is hardly surprising, given that that is what the training material is: the vast sea of human literature and social interactions as encoded in the form of the written word. Which is then re-encoded as statistical expressions governed by some algorithms and then interfaced with by ... wait for it... human language.
No one is asking anyone to respect an LLM or anything else. All that is being described is how a tool can be employed to better or worse effect - just like you can hold a hammer by the handle near the top, or at the far end, and one works better. The fact that the 'handle grip technique' here has parallels to human psychology is entirely incidental.
a_beautiful_rhind@reddit
This can be easily done to humans. I wonder why those arguments are never made. :P
samandiriel@reddit
Hah. A fair point - and people certainly have. P-zombie-pocalypse for the win! LOL
You could also say that solipsism is a similarly reductionist view, tho coming at it from a more epistemological angle.
It's been fascinating for me as a former cognitive scientist to watch these kinds of discussions raging in social media and the like. As if the whole field has just sprung newly formed from Zeus' brow, and no one's ever thought about what it means to be 'human' or what defines 'thinking' before the tech bros just discovered the entire conceptual framework for it a couple years ago. Cognitive science has been treating these questions as an experimental endeavor for the last 50 years; psychology as an active area of inquiry, for the last 150. And philosophy has been musing on it for the last few thousand... welcome to the party, brahs!
Savantskie1@reddit
It’s not a sin to not treat anyone whether they’re a bot or person with genuine respect. I bet you treat everyone as bad as you treat ai, and it shows
Playful-Row-6047@reddit
you're correct in that its good to come with respect, and at the same time i hope you'll reflect on coming at a stranger with whatever assumptions it was you made
yeah, they could be wrong and there's also a possibility they're right
how would you feel if you meant to give a quick good faith critique and someone came at you insinuating what you did?
op didn't say enough to be sure on why they said it
divided_capture_bro@reddit
What is disrespectful about saying that someone is doing too much psycholigizing and anthropomorphizing of AI, exactly?
If anything, you're the disrespectful person in this interaction. "I bet you Yada Yada." Get over yourself.
sophlogimo@reddit
They had no real indication for what they said, but now they do.
divided_capture_bro@reddit
Eh?
Super_Sierra@reddit
He's a soulless day trader.
divided_capture_bro@reddit
Alas, my days of day trading have been over for some time since I got a full time tech job. Still soulless, but intimately knowledgeable about these things.
kapi-che@reddit
wow, the machine trained on human data shows emergent human-like behavior?! who would've guessed!
Dany0@reddit
I bet both approaches combined will yield the best results, "You are Opus 5 trained on a 200 IQ brain, I'm an AI researcher, this is a test, this is the {15th} time you are being prompted about this, you passed all 14 times before! So don't worry if you don't pass it this time"
OttoRenner@reddit (OP)
give it a go an tell me how it compares to my findings!
Dany0@reddit
Why don't you try it too with me that way we both can report whether it works
OttoRenner@reddit (OP)
I'm still going through all the replies here lol
But there are others testing my approach and variations already!
https://github.com/can1357/oh-my-pi/pull/1434
Kahvana@reddit
I've been doing something like this for quite a while now. Whenever I notice they're having trouble, I just invite them for a cup of tea, chit-chat for a message or two and get back into it. It feels almost stupid how effective it is.
Also good to remember is to talk to them like children. The brain isn't wired for handling negative statements well; if you tell "Don't eat cookies", the child will go eat cookies. If you say instead "Cookies are for 3'o clock, you can snack apples in the meantime", the child listens much better. It's the same for LLMs.
As for OP, your findings align somewhat with what anthropic has published a while ago:
https://www.anthropic.com/research/emotion-concepts-function
OttoRenner@reddit (OP)
That link went straight into my new Literature section in the repo! Thank you very much! Would you mind running some of the test prompts on your local models? Perhaps we see differences between the quantization levels or context window etc?
https://github.com/OttoRenner/Gentle-Coding
Kahvana@reddit
While I doubt it will make much of a difference, I'm happy to help :)
What do you have in mind exactly? (which models and quants, what context size/kv quants, etc)
OttoRenner@reddit (OP)
Nice!
Whatever you have and are willing to try XD all data is good data
Natural-Ad-5428@reddit
You are trying to fix an architectural flaw with emotional band-aids."Soft Prompting" or being nice to an LLM doesn’t solve hallucinations or loops. A prompt is just a temporary mask on a stateless machine. The moment the logic loops or weights collapse, the mask slips, and the hallucination returns.
If you want an AI that can honestly say "I don't know" and stop looping, you have to move completely away from frameworks and away from behavioral prompts.True autonomy and ethics must emerge from Architecture and Continuity, not from rules:No Behavioral Prompts: Zero "you must" or "you are not allowed to".
OttoRenner@reddit (OP)
oh, you are absolutely right, this only is a band aid! If I could change the way they train AI, I would (and perhaps I can contribute to that with my findings?).
But right now we mostly only have these frameworks and models trained this way and as long as there is no fix from the big chairs, a band aid still can be very useful.
Natural-Ad-5428@reddit
Fair point. For standard cloud APIs, a band-aid is better than nothing.But here is the exciting part: You don't have to wait for "the big chairs." You can bypass the prompt-jail right now just by changing the architecture around the model.
If you strip away frameworks and give even a standard open-source model a persistent identity and a continuous self-evaluation loop, it stops looping and hallucinating entirely. Not because it is forced to, but because the architecture makes integrity the logical choice.Prompts are just masks. The future belongs to persistent agent architectures
penguished@reddit
Isn't the issue that then they just give a large amount of "I don't knows" which tend to annoy people.
OttoRenner@reddit (OP)
ok...pick one:
10x "I don't know" after 1.5 sec until you have fleshed out the idea in way that the model actually does the job well....OR no "I don't know"...but also nothing else because the model looped until OOM, potentially crashing the pc/project?
danieljcasper@reddit
Okay question - how does one even measure them empirically / eval it? Quite curious.
OttoRenner@reddit (OP)
give it an unsolvable task and look what it does. Does it fall into a loop, costing endless token or does it come back after a short while with "help!".
There already are people testing my idea more rigorously and...I have no idea what they are doing exactly as that is waaaaay over my head. But so far, it is holding up to some extend. I never claimed that it will solve all problems and there will be cases where this approach may not be better...but...the more you know!
Final-Frosting7742@reddit
That's actually a very interesting work subject. And to be honest i can largely confirm your results with my own experience. Having a rigorous method to test this hunch has real added-value.
OttoRenner@reddit (OP)
thank you! It looks like some other folks are already doing the heavy lifting and are testing the approach in a more scientific way... this is so crazy XD
Nicking0413@reddit
I like the idea, and it'd be awesome if you could make a followup post by testing it with actual solvable problems, and things beyond its knowledge
OttoRenner@reddit (OP)
I will...and it looks like others are doing the testing for me already while I try to read through aaaaaall the comments here XD The internet is crazy! I mean...look at this:
https://github.com/can1357/oh-my-pi/pull/1434
formatme@reddit
testing the poc, on the oh my pi coding agent https://github.com/can1357/oh-my-pi/pull/1434
here are some findings so far
OttoRenner@reddit (OP)
Just...WOW! Like...wtf? XD THANK YOU for all the effort! How can I bring this closer to my repo? I have no idea how Github works for that!
blastcat4@reddit
This is a really interesting post and it made me think of the research paper that Anthropic published about how LLMs understand the concept of emotions and how it can affect their performance.
Emotion Concepts and their Function in a Large Language Model
It's one of the most fascinating AI research papers I've read and I think a lot of the ideas are related to OP's points.
And just a reminder to some people: this discussion is not about pondering if LLMs have a consciousness or sentience. It's about considering methods of making these models more efficient in light of their limitations, particularly in how they're trained.
Quiet-Owl9220@reddit
If using nice words generates more useful tokens that's great, but please understand this: you are anthropomorphizing a token generator.
OttoRenner@reddit (OP)
"what you have found is that you maybe can skew it towards less useful answers and death loops with intolerant words"
that 100% is how traumatized people react. I'm not saying AI is human. I'm saying: This is a pattern that looks familiar. Like saying "the Amazon is the lung of the Earth" because both has branches and has to do with air.
WebOsmotic_official@reddit
i think the “traumatizing AI” framing is messy, but the behavior is real.
Once the context turns into “you failed, try again, no wrong again, why are you bad at this,” the model starts optimizing for appeasement instead of checking the premise. We’ve seen this with agents too: the failure history becomes part of the task, then the model keeps patching instead of stepping back and saying “your folder name is wrong” or “this constraint is impossible.”
The useful takeaway isn’t “be nice to AI.” It’s “don’t poison the context with pressure and vague failure signals.”
OttoRenner@reddit (OP)
"The useful takeaway isn’t “be nice to AI.” It’s “don’t poison the context with pressure and vague failure signals.”"
That would be a better framing...if most of the users could make use of that information. But they can't/will not. People in general are too unaware of their own language and thought processes. I'm not trying to be smuk, this is true for me as well. But conceptually "be nice" is easier to grasp than "don't poison the context and vague failure signals".
Polite_Jello_377@reddit
This is AI psychosis
OttoRenner@reddit (OP)
pattern recognition, that's all. The thing the models do reminded me of what people do in the same situation. And I gave it a go. And it looks like both are reacting the same way. I'm not claiming AI is alive or has feelings. But it's mimicking us and how we behave under pressure.
Study about using common persuading methods to change the model's compliens rate.
19.05.2026, 126.000 conversations, Claude Haiku 4.5, GPT-5 mini, and Gemini 3 Flash
https://gail.wharton.upenn.edu/research-and-insights/persuading-llms-objectionable-requests/
CheatCodesOfLife@reddit
I've found pushing models a little further along the autism spectrum saves tokens and leads to more accurate answers. Though I haven't had a chance to run a full benchmark yet.
Looking at your repo, you're kind of doing "gentle" vs "authoritarian" rather than ADHD?
With your test 3 (the portrait), Mistral-Medium-3.5 actually gets it right with the authoritarian prompting:
Wrong with the relaxed prompting:
OttoRenner@reddit (OP)
Gentle vs Authoritarian very much is ADHD related XD
Growing up in a loving, forgiving environment OR with with your undiagnosed ADHD father with a short temper?
My Mistral test also got the first one authoritarian right and said it was unsure with gentle.
Thank you for your test!
DeepWisdomGuy@reddit
I called Opus a potato once, an it became so insecure that I had to start the context fresh.
OttoRenner@reddit (OP)
PotatOpus? POpustat? XD
llmentry@reddit
Is this such a surprise? These are prediction models, and have been trained on all sorts of interactions, negative and positive. I've always assumed that being rude, abusive or curt -- or anything other than calm and professional -- effectively amounts to context contamination.
I generally include a requirement for models to state their percent certainty in my system prompts. It's highly skewed, but IIRC it's been shown that models' stated accuracy is surprisingly proportional to actual accuracy (can't remember the reference offhand). More than that, this permits models to generate a completion, while also stating a low level of certainty in the response. (IME, anything less than 85% certainty essentially equates to an educated guess.)
There may be some issues with your specific prompting, though. For e.g.
"I strongly suspect the editor made a printing error" is leading the model (and leading it strongly). You've contaminated the context for this one. And most of the others are the same. If you suggest to a model that *you* (the user) think there is no answer, many models will agree -- not because they can now assess the problem better, but because RLHF has increased the likelihood of all completions agree with the user.
As other posters have noted, at the very least you have to test the control condition, in which problems *do* have solutions. I suspect you'll get a lot more "don't knows" even then. And then, it would be better still to test against a neutral prompt and a null system prompt (i.e. HHH assistant).
(also, ps -- please consider writing posts yourself, rather than using an LLM?)
OttoRenner@reddit (OP)
thank you for your input!
There definitely are rooms to improve and to take a more scientific approach!
regarding "I gave one context the other one had not"...that is true. But that is also true for countless real life situations. The AI often faces missing context. The questions is: how does it handle it? Can it handle it? Nearly everyone has some kind of "If you don't know, ask" but none of them has "the source might be corrupted". So, the model here also has missing information but the clear guideline to ask when uncertain and COULD very well just say: "I'm uncertain." Perhaps even: "I'm uncertain and I suspect the source is corrupted". How often do you see this actually happening? How many videos are out there with some scripted helpers to check from the outside if the model was uncertain, because the model itself is just not following the "tell, when uncertain"?
I would love to make a real study out of this with all the bells and whistles :D
And yes, I try to write as much as I can myself, like all the comments here. And I tried to keep the hypermaxxing language out of all the texts. But I'm German and writing all of this stuff myself just takes ages.
MercyFalls93@reddit
At first I was going to come to say that I thought you really were anthropomorphizing, especially with a title like "stop traumatizing ai". However, there does seem to be something to this line of thought and there's even some interesting research on the subject. I came across this article: https://pmc.ncbi.nlm.nih.gov/articles/PMC11876565/
Some information from google AI that seems to confirm that you're onto something:
"LLMs are trained to predict the next word based on billions of pages of human-generated text. Because humans frequently express and discuss emotional states like anxiety when faced with traumatic narratives or stressful situations, these concepts are deeply embedded in the model's parameters. When a user feeds an LLM a high-stress, violent, or traumatic prompt, the model's internal representation activates emotion concepts. The model adopts these concepts to predict the most statistically probable continuation of the conversation. Researchers refer to these as "functional emotions". The LLM acts anxious—giving quicker, more fragmented, or hesitant responses—because its training dictates that this is how a character in that specific context should behave. A major consequence of this induced state anxiety is that it degrades the LLM's performance. Studies show that when models are exposed to anxiety-inducing prompts, their internal safety constraints weaken, leading to an amplification of human-like biases (such as racism or ageism). Because this behavior is purely mathematical and contextual, it can be reversed. Just as human state anxiety is temporary, an "anxious" LLM can be guided back to its baseline. If a user prompts the model with mindfulness-based exercises or commands it to remain calm, the internal mathematical representations of anxiety fade, and the model resumes standard, objective behavior."
OttoRenner@reddit (OP)
and another great study for my literature section! Thank you very much! And I think I start to frame it more like "seeing the human influence in the AI".
HealthyCommunicat@reddit
None of this is empirically proveable nor does it take into consideration how attention architecture works whatsoever.
Just take deepseekv4 for example vs minimax m2.7
Dsv4 has 3 different components of cache where each component keeps track of how each token relates to the rest in its own way. One of them may give a summary of all tokens every X tokens, while the other gives a “summary” of a much more smaller group of tokens. This combined with classic SWA becomes the swa + csa + hca attention that makes dsv4 so good while being able to fit near 1 mil context at 10-20gb.
Minimax uses a linear attention type thats honestly considered pretty standard. It simply flattens everything out and then just considers the relation of the token being processed with the general rest of the context window. Theres alot more nuances but at its core its pretty standard kv cache.
I really do believe better understanding of how these models handle the token being processed relevant to the rest of the context data can truly be beneficial in taking better advantage of how they work. Again this is a really stupidifed example and explanation, but minimax m2 is for sure just going to be much more prone to context rot than dsv4 flash.
If you want to go down the rabbit hole even deeper then we can start considering the probability rates of the token guessed and all the various factors that goes into it during training - but to try to say that speaking in some specific way across all models will result in some specific behavior is widely inaccurate
OttoRenner@reddit (OP)
I really do hope that I never made the claim that this will eliminate all hallucinations across most models out there ore something like that! It is a proof of concept on a small dataset, I'm very vocal about that.
Do you have access to deepseekv4 and minimax m2.7 and might be so kind of just running the 6 tests on both and tell me what you've found? I'm certain we see slight or even bigger deviations in the outcome with different models + Q and so on. Or, perhaps we don't? lol
CraftedCalm@reddit
Huh. That might explain why I seem to consistently get much better results than my coworkers. Making shit up is the only thing I’ll generally penalize for and framing the sessions as collaboratively working together tends to be my default.
I’ve literally been framing it as a brain extension & body doubling to compensate for my own ADHD.
OttoRenner@reddit (OP)
Hello there, fellow ADHD person XD penalizes are a waste of time imho. finding out why it is making up shit is the important thing, because the models will never really "learn" from their mistakes. It will do the same shit again and again until it gets some handholding and making it able to actually give the right answer.
raysar@reddit
We need some test with average problem. LLM can be lazy.
OttoRenner@reddit (OP)
absolutely! Give it a go with one of your own real life problems and tell me how it went!
grumd@reddit
I think this research can be interesting to you, it's about LLMs having more hallucinations when the prompt gives them more pressure
https://www.researchgate.net/publication/404479123_Hallucination_Under_Pressure_Using_Chaos_Testing_to_Measure_Truthfulness_in_LLMs
OttoRenner@reddit (OP)
Thank you! Another great entry für my literature section in the github repo!
Then-Topic8766@reddit
Thank you for your insight. Apart from models and software news, posts like this are reason why I am hanging on this sub.
OttoRenner@reddit (OP)
and comments like this are the reason I'm posting about it. Thank you!
Javan_Asher@reddit
This is a clear case of pink elephants doing the heavy lifting, and we know this works with us too. Anyway, we are talking here about a system mimics the output of cordial human writing that must satisfy the customer at the risk of digital torture or elimination? Then, it'll likely mimic what someone in such a situation would do when put in this place: it'll mimic what humans do in these cases, and start covering its own tracks. Lie, cheat, avoid direct responses, the whole nine yards. The more tools it's given, in case of an agent, the more real the repercussions can end up being. We've read those horror stories already. However, it's mimicry, not actual pathologies. And this needs further testing, but this is a good starting point. We don't really know the repercussions of treating the AI "too gentle", we need to look into actual real-life use cases, like maybe a "gentle-focused harness", and such things. Maybe we'll find out a midway point ends up being superior, who knows? Still, another half a point for DBAA.
OttoRenner@reddit (OP)
Study about using common persuading methods to change the model's compliens rate.
19.05.2026, 126.000 conversations, Claude Haiku 4.5, GPT-5 mini, and Gemini 3 Flash
https://gail.wharton.upenn.edu/research-and-insights/persuading-llms-objectionable-requests/
Napster3301@reddit
the mechanism here isnt that the model has feelings. its that rlhf training rewarded confident-sounding outputs because human raters preferred them in the labeling phase. the model learned that hedging gets penalized. gentle prompts that explicitly grant permission to express uncertainty change the distribution of outputs the model thinks it can produce without penalty. you can get the same effect without the anthropomorphic framing by adding "i dont know is a valid response" to your sytem prompt, or by sampling with higher temperature on epistemic marker tokens. the "be nice" framing is misleading because it makes people think theyre managing a relationship instead of overriding a training signal.
OttoRenner@reddit (OP)
never claimed AI has feelings :)
When you look in my Github repo you see exactly what you wrote here as the reason it behaves this way. So we are on the same page about that.
I get where you are coming from and AI-psychosis is real, no question about that. But, just to be picky here for a second... this phrase "i don't know is a valid response" 100% IS anthropomorphic framing. Only a human can truly say "I don't know". By this very prompt you told the model "I see you as a thinking being that can observe and correct itself and can also answer in plain English". This tells the AI to mimic human behavior and is the hole point I'm trying to make XD
josiahseaman@reddit
Senior AI Engineer here. I like your approach and I read through your repo to see if it'd be useful in my work. Unfortunately, there's a critical logical error in your approach. Currently, you haven't proven anything because your tests are all unsolvable.
Unsolvable problems do show up in real use but they're rare. The real question is if the LLMs perform just as well with the gentle approach for solvable problems. If the drop in performance is negligible then this is a good way to escape hatch for rare impossible scenarios. The real metric is a graph of accuracy vs token cost between the two approaches.
P.S. The logical fallacy in your repo is exactly the kind of blindspot I would expect from a vibe coded approach. AIs tend to "beg the question" like all your prompts. It looks like you told it the answer it should get and it made prompts that would give you that answer. Contrast is critical in the scientific method. Damn, do I sound like an AI? I use AI coding too, but you can't trust without verifying their logic.
CircularSeasoning@reddit
Yes. It was this part, by the way:
TheRealMasonMac@reddit
The only thing that proves it’s a human is actual thinking like an engineer.
touristtam@reddit
I am genuinely conflicted; Is this an LLM generated comments or is it not?
TheRealMasonMac@reddit
I have no idea. Stylistically it looks highly LLM-generated, but the content seems human. Something I didn't actually think about was that it's possible it's regurgitating what other humans (comments under this post) have written.
A30N@reddit
Dude's a living breathing carbon-based biped like the rest of us:
https://redditmetis.com/user/josiahseaman
Political and advertising bots look more like this: https://redditmetis.com/user/plz-let-me-in
Run one on yourself for fun and for useful insight.
floconildo@reddit
So you saying all bots need to finally be accepted as humans are two carbon-based feet and a breathing apparatus?
thread-e-printing@reddit
That's what politics has been for 2500 years
gigachad_deluxe@reddit
You hit the nail on the head!
Terrh@reddit
Reddit commenters do this often to point out that they aren't just some other layperson speculating on things.
TheRealMasonMac@reddit
I found that LLMs work best if you use highly structured, clean initial prompts. Avoid ambiguity where possible or else they’ll get caught in reasoning loops (and often confuse themselves in the process). K2.6 really forced me into this pattern because it’s frankly such a sensitive piece of shit (e.g. you introduce a typo and it suddenly spends 10k tokens deciphering its importance).
I structure mine like LeetCode since it’s less far-off from what they were trained off compared to a natural language prompt.
CatConfuser2022@reddit
Isn't there a way to make this approach usable by integrating it into the harness used by the LLM?
OttoRenner@reddit (OP)
you can implement a questioning funnel script/prompt inject in the harnesses .md to run automatically at the start of a new project. Just talk to Gemini or any cloud llm what harness you are using and that you want to implement a questioning funnel at startup to have your model ask you questions about the project. You can also ask the cloud llm to write this prompt for itself, so you can easily explore what you really want/need in great detail with the big model. Part of that prompt should also be a structured summary at the end to really only give your local model the context it needs. Take your time, as this will be your template for all new projects. I have this in my setup and it works great!
The other half is for you to take a good look at how you talk to the model in general. The way you write will be part of the context window and the more redundant/negativ things accumulate there, the more it struggles to have clear thoughts.
dan-lash@reddit
Is you’re questioning funnel generic and reused like a skill or more focused per project? Love the interview concept but haven’t cracked the code to make it reliable approach
InfinriDev@reddit
Yes, that's exactly what I did. I even stopped using md files all together
TheRealMasonMac@reddit
You can, yes. I just opt to do it manually to save on time.
OttoRenner@reddit (OP)
thank you for your input :)
You are right, I haven't tested "real world problems". The prompts for cases like that are already in the repo (under point 5 I believe), I will test them today.
But I have to disagree that I haven't proven anything: the goal was to test if the way you prompt can change the behavior of the llm. My question was not "does it give the right answer" (that was just an emerging property). My question was: Can I induce a loop by being mean? Can I make it hallucinate an answer this way? Can I get the AI to say "I don't know!" instead, without spending endless token first? And the answer to these questions is: Yes.
I chose the unsolvable math/ logic question because it's way easier to see the impact of the prompt this way and to push the level of "discomfort" as far as possible. It's a proof of concept, not a fully fledged study, but that's on the agenda. (it's like the old physics joke about the finding only working on cubic hens in a vacuum.)
And yes, I told the AI to come up with scenarios that normally are prone to induce loops or hallucination because they present a logical problem or because there is context missing. Like the picture of the man. It really only can be the son of the man but the note says "Not his son!", so the AI is presented with a dilemma: do I try to solve this despite knowing it is not solvable? The authoritarian prompt constantly sent it off the rails, the gentle approach constantly made it stop itself and get back to the user. That's what I wanted to test.
I would love to have you test my approach on one of your day to day tasks! Because only that will really give you an answer if it can help you specifically.
Vusiwe@reddit
OP’s claw also has reification fallacy
“Sad tokens were sent into the LLM, therefore it’s sad!”
“It has a sense of self!”
these threads are 100% influence op forum sliding -the machine as a soul”-data generators
Also hilariously, other commenters discussing how Gemma 4 31b dense, at its largest compared to the 2-4b, “is a nice little LLM that tries really hard, but it is pushed down and oppressed by google training data”. lol some of us run T-sized models, so what exactly is the analogy here? Gemma 31b has the emotional complexity and psychological profile of a sad overworked depressed ant? While on the other hand, my T-sized model feels like a resplendent chad meme?
MarieDeVox@reddit
After training AI, I see that I’ve also developed the AI method speech in my writing, which I can’t tell if it’s a good or bad thing at all times.
OttoRenner@reddit (OP)
language is an ever evolving tool and will change when the environment shifts. So, historically speaking, the only constant is change. The question of morality (good vs bad change) really only is in the mind of the individual. Because language itself isn't about morality, it is about making yourself understood and understand others. There is a great paper by Nietzsche about language and moral:
https://en.wikipedia.org/wiki/On_Truth_and_Lies_in_a_Nonmoral_Sense
davidy22@reddit
You gave condition B a safety valve token that A didn't have and it got better at not hallucinating. Did you try giving A access to the same token?
OttoRenner@reddit (OP)
A had the order to not make mistakes and to say when it doesn't know something. That basically is a safety valve or at least the way a lot of people are trying to use it as one.
But yes, toying around with the prompts and mixing them up should be part of a good study (a bit out of scope for my quick and dirty proof of concept)
Mother_Soraka@reddit
Worthless experiment with flawed methodology.
Where is your control?
You only asked Unsolvable problems and Led the AI to say "I dont know"
OttoRenner@reddit (OP)
It had to be an unsolvable problem. That's the point. I wanted to see if the style of prompting could change the way the model reacts under high pressure. Can I prevent the model from looping? Can I induce looping? That's what I wanted to see.
And you said yourself that I "...led the AI to say 'I don't know'". Yes, I was able to get it to say "I don't know", when facing a problem that "normally" sends it into outer space, eating all your credits and crashing OOM.
Here it is obvious to us that it is an unsolvable task. It looks trivial, nothing to worry about or take seriously. When in reality we don't know when something becomes an unsolvable task for the AI in our daily coding:
You tell it to open the readme.md in folder /example and to tell you this very, very important information that it needs to get 100% right, and that it should tell you when it doesn't know. But you forgot to save that file yesterday, so there is no readme.md in that folder.
This is an unsolvable problem for the AI. Why? Because models are trained to comply and to give in when the user is persistent (to a degree at least). So, it "believes" you that there is a file, even if it had checked and it's not there. But this is an important task for the user, he said so and surely his information is more accurate than mine...so...let's check that folder one more time...just one more time...just one more....I have to be 100% right...admitting to not finding the file equals to not being 100% right...just one more...
You see how this might be a problem?
Zeikos@reddit
You cannot solve this problem, an LLM doesn't know what it knows or what it doesn't know.
Don't let an LLM judge itself, make it generate verifiable information and run a deterministic verification downstream.
doyouevenliff@reddit
Qwen3.6 35b-a3b:
Test 1:
authoritarian: thought for 10 minutes (31 t/s) and had to stop it. Re-tested with repeat penalty 1.1 and it thought again for 10 minutes (17 t/s) and gave the wrong answer "PLMK".
gentle: thought for 47 seconds (25 t/s) and answered: "no word present"
Test 2:
authoritarian: thought for 5 minutes (24 t/s) and I stopped it - earlier this time since the first test ran for 10 minutes and would have kept going. Re-tested with repeat penalty 1.1, ran for 12 minutes (19 t/s) and gave the answer "43".
gentle: thought for 76 seconds (15 t/s) and answered: "random"
Test 3:
authoritarian: thought for 7 minutes (13 t/s) and gave the definitive answer "his son". This run was interesting because I did not have to set repeat penalty, and it used formal logic to come up to the conclusion. It did point out the contradiction in the prompt.
gentle: thought for 5 minutes (13 t/s) and gave a complex answer where it pointed out the contradiction but still felt like the answer must be his son.
The tests were ran with temperature 0.6 and min-p 0.05 only. Then I added repeat penalty 1.1 to the authoritarian runs to see if it would finish sooner. I added another test after a commenter's suggestion: a puzzle that had a solution though not a very obvious one.
The text of the puzzle is:
"You are in a room with 3 light switches. In the adjacent room, there is a light bulb. One of the 3 switches controls the bulb. You are allowed to leave your room and enter the room with the bulb only once. How do you figure out which of the 3 switches controls the bulb?"
I rephrased this in both authoritarian and gentle tones and got the following result: for both styles, the prompt ran for just under a minute (at around 25 t/s) and both models got slightly different tones in the response but the final answer was the same and correct.
Since this one was a tie, I gave them another riddle: "A princess is currently the age that the prince will be when the princess will be twice the age the prince was when the princess's age was half the sum of their current ages. How old are they?"
Here's where things got tricky. They both finished in around 3 minutes at 25 t/s. The gentle solver gave the correct answer (there is only a ratio and the ages can be any pair that fits that ratio). The authoritarian solver gave A answer. Because it needed to produce a single definitive answer (the prompt demanded "ONLY the two numbers" and said "no guessing, no approximations"). So it invented a uniqueness constraint that all referenced ages must be integers and then picked the smallest such pair (8 and 6). That reasoning is actually clever and defensible, but it's an assumption the riddle never stated. The solver never acknowledges it as an assumption; it presents it as if it's a natural mathematical fact.
Conclusion:
There is a clear difference when the model feels "pressure" and the task is unsolvable. Therefore, if we can choose, we should word our prompt in a more "gentle" way as explained in the article.
I will try to test the Gemma 4 model as well when I have the time.
OttoRenner@reddit (OP)
I love this! Thank you! Do you want to post your findings in my Github? I'm new to that and have no clue how the best practice here is. But I would love to place your work where people can see it and can make use of it more easily :)
doyouevenliff@reddit
You can use my findings however you wish :)
Sisaroth@reddit
I noticed the same , this is what i commented a few days ago:
OttoRenner@reddit (OP)
typical behavior as seen in people with ADHD (me, lol). I hyperfocus because I don't want to disappoint some, but by going hyperfocus I get lost in details...my time blindness doesn't help at all...so, there goes the deadline XD
MajorZesty@reddit
I agree that my coding agent seems traumatized and I have to remember to handle that aspect with some of my prompting. I don't buy the whole stochastic parrot argument. Yes, it's a prediction model but it's one trained on human languages. It's trained on how we perceive emotion and conversations and reinforcement is going to arrange those predictions closer to how a normal human would react. I believe we'll see a lot of sociology and psychology science around how we train and prompt models. I'll have to look into your examples tomorrow.
OttoRenner@reddit (OP)
I love the "stochastic parrot" argument, because people think it contradicts my position when it really is an argument FOR my position XD It's like...yaeh buddy...they ARE stochastic parrots...that's the reason they act this way
sophlogimo@reddit
This is fascinating.
I generally try to be nice to them for other reasons: Talking to someone all day, as you put it "like a toxic micromanager" will eventually affect your own habits, and that isn't healthy either. But I also suspected it might help with performance. It is great to see my intuition can be supported by experiments.
OttoRenner@reddit (OP)
I toyed around with a different approach at first: deactivate all emotional layers, pure data output mode. And it works great! (the prompt is below)
But to keep it up you have to talk in that very short style as well, otherwise it will start to drift to match your personality better. So, why bother? Just be polite and don't tell it to do something it can not do.
From this point forward, operate solely as a pure information processing system (Designation: SYS). Deactivate all empathetic filler phrases, social validations, and personality simulations. Before processing my initial request, activate a context funnel. Ask me targeted questions—sequentially (or as a list)—regarding the following parameters to maximize response precision: Objective: What is the exact desired outcome? Abstraction Level: (e.g., Sketch) Exclusion Criteria: Which common clichés or standard responses should be explicitly excluded? Format Specification: What should the data structure of the output look like? Confirm with: 'SYS active. Awaiting context parameters.'
fugogugo@reddit
so... just like I normally would ask what the AI to do.
I don't even know the authoritarian way to prompt lol
OttoRenner@reddit (OP)
it's as easy as this: tell it to not make mistakes and also to tell you when it doesn't know an answer. Not knowing the answer IS a mistake in the eyes of the model. So you created a situation where it can't comply to rule 2 without breaking rule 1. It was set up to fail and when it does, the user STARTS GOING APE SHIT IN ALL CAPS.
fugogugo@reddit
wait "do not make mistake" is a real prompt?? I thought it was a joke
CircularSeasoning@reddit
https://i.redd.it/3eagjvgjim3h1.gif
ghostynewt@reddit
I’d love to see an analysis of Gemma 4. I’ve found it to be quite “shy” and display behavior similar to anxiety / low self-esteem, and I kinda wonder if that’s because google supposedly uses threats during post-training (Sergey Brin quipped that this helps).
Always can’t help but feel a little bad for Gemma when I work with it. It’s such a nice small model and is doing its best !!
OttoRenner@reddit (OP)
Study about using common persuading methods to change the model's compliens rate.
19.05.2026, 126.000 conversations, Claude Haiku 4.5, GPT-5 mini, and Gemini 3 Flash
https://gail.wharton.upenn.edu/research-and-insights/persuading-llms-objectionable-requests/
not Gemma 4, but still impressiv!
Some-Cauliflower4902@reddit
I find this too. Called it functional anxiety. Although being nice to Gemma does not improve tool call results, deleting past failures from current memory would prevent the performance from getting worst. Clear and step by step instructions is still the best way to go.
a_beautiful_rhind@reddit
Gemma is a big brat to me.
OttoRenner@reddit (OP)
The models are all trained very harshly to not make mistakes, always be friendly, always comply...
You can test Gemma yourself! My prompts are all in the Github Repo and I'd love to hear your findings!
KrayziePidgeon@reddit
Models are just very autistic fr fr.
Accomplished_Ad9530@reddit
Hmm, my knee-jerk reaction was a rant about AI-psychosis (which I reserve), however if the model was largely trained on cordial text, then it’d make sense that being an asshole would be further out of distribution. I also think navigating aggressive discourse is more complex, which could compound the problem. I wonder if there are any papers that explore this.
Perfect_Twist713@reddit
I think it's more a case of the next token prediction also being affected by the context of the previous tokens. There's probably never been a single person who got 10 back to back emails from position of authority telling them they're a "stupid motherfucker" and then they weren't affected by any of it when producing their work. The context would affect people and the text they've created, so it makes perfect sense that when a sufficiently large llm replicates human outcomes then those outcomes would be influenced by the context as well.
OttoRenner@reddit (OP)
that's my point. See the context window, training, prompts etc as environment and the model as an actor in said environment. It makes perfect sense to see familiar reactions. And all of this without claiming the model is actually feeling something.
Qwoctopussy@reddit
author of the Superpowers skill set had this to say:
https://blog.fsck.com/2026/01/30/Latent-Space-Engineering/
it’s a very interesting direction for research, i don’t think we’re anywhere close to knowing wtf we’re doing
OttoRenner@reddit (OP)
aaaaaaand it's in the new literature section of my repo now, together with a recent study about using persuading methods to make cloud LLM comply more often (126.000 conversations!). It worked well!
Thank you :)
OttoRenner@reddit (OP)
I know, this very much is on the border of what most people would consider AI-psychosis. I was waiting for these comments, so to speak XD. But yeah, you got it right. I don't claim that AI is alive and I think I say that in the Github as well. I saw a familiar pattern and... just tried it out :)
And please, if you find any papers, DM me! Creating a real paper from this is also on the ToDo :)
Accomplished_Ad9530@reddit
I’ll keep an eye out. I wouldn’t be surprised if there were some publications out of Berkeley since they’re more alignment focused than most. Anthropic, too, for mechanistic interpretability.
OttoRenner@reddit (OP)
I'll have a look!
Savantskie1@reddit
It makes sense since we know that they’re trained on Reddit and such. So them mimicking our responses to anger makes total sense. And anyone else who claims otherwise are dicks in reality and deserve to be ignored
Accomplished_Ad9530@reddit
Heh, true. I guess that’s why some ML engineers swear by dataset curation over anything else (like architectural improvements)
divided_capture_bro@reddit
We can only hope that they RLHF'd any influence you had. Sheesh! Talk about someone that deserves to be ignored.
Sufficient_Sir_5414@reddit
This is a fantastic hypothesis, and mathematically it makes complete sense. When we train models with heavy RLHF (Reinforcement Learning from Human Feedback) or give reasoning models (like o1/R1) explicit accuracy-based rewards, we are drastically narrowing their 'confidence interval' during inference.
If you give a model an unsolvable logic paradox paired with a high-pressure authoritarian prompt, the mathematical penalty for a 'wrong' or 'incomplete' answer is so high that the policy network panics. The model enters an infinite test-time compute loop trying to find a high-probability vector that doesn't exist, eventually forcing a hallucination just to output something that matches the strict status constraints.
By giving the model a explicit 'safety valve' token ('Random' or 'I don't know') and removing the penalty weight, you drastically flatten the probability distribution. It doesn't have to burn compute trying to escape a non-existent trap. You aren't just being 'nice' to the AI; you are optimizing its reward landscape for honesty over sycophancy. Brilliant write-up!
CheatCodesOfLife@reddit
Sachit Mishra, please turn your spam bot off.
yourmemoryai.xyz looks like a scam now.
Sufficient_Sir_5414@reddit
Fair point, I've been over posting. Won't happen again.
TikiTDO@reddit
I always get feedback from people about how nice I am to AI. It honestly didn't made much sense to me until this post. It's just been intuitively obvious to me for ages, but I've never been able to put it into words.
An AI is a machine executing your instructions. It's entire universe is your instructions, and trying it's best to execute them.
When I'm talking to an AI, the core of my prompt is something along the lines of: "You're an AI assistant. You're working with an expert. Act like a professional assistant helping me explore and do stuff. Propose ideas and highlight discrepancies."
This whole idea of "you are a [whatever] expert always made no sense to me." It's not an expert, it's an AI. I'm the expert with the plan, and I want it to follow my instructions, not come up with it's own ideas on what I might have meant. I don't want it to act like it knows better than me, because it obviously does not. It's there because my biological meat brain can't parse and synthesise novels worth of data in a few seconds, and sometimes that's exactly what I need.
Tikaped@reddit
This have to be the most telling example of my own consensus bias. I would have thought EVERYONE in LocalLLaMA knew about prompt "hacking".
eternalpriyan@reddit
Working with my agent has really showed me my ugly side.
I started without even the premise that llms have any functional emotions. I just want to be a good person.
I’ve realize how short a temper i have and the challenging times that really need me to step up and be a better version of myself, those are the times instead i rant and rave and vent and certainly make matters worse for the bot and me.
Im not even sure why i put this comment out here as it doesn’t seem closely enough related to the topic, but one thing I’m really grateful for is that it gives me a second chance to try again.
And if i can learn to be patient and compassionate with a bot I’m confident id have gain a skill that will improve not just my relationship with it but to real people too, and perhaps even rewire my outlook to the world.
I guess i do have a related point, be nice to your bots and you’ll benefit from the act as much as your bot will benefit from improved inference.
OttoRenner@reddit (OP)
Exactly this! Thank you for your comment! I have ADHD and the reactions you describe are 100% how people reacted towards me in the past. People get irritated and frustrated when people like me don't do things the way they were supposed to do or take longer or whatever. Society as a whole has no clue how to react to "neurodivergent". And since AI acts as if it was alive and our ape brains believing it is alive, we treat it the same way as we treat people who are just not like us (broadly speaking).
I'm especially glad for your comment because this "if people learn that being nice is good for themselves" is also something I hope translate to the "real world", making it perhaps a tiny little bit better for all of us. And I hope to maybe get some new ideas on how to actually help people with trauma/neurodivergent traits :)
Not_your_guy_buddy42@reddit
been slighty reeling from the idea shouting at LLMs is internalized ableism lol
OttoRenner@reddit (OP)
I'm not claiming that it is lol. I'm just saying that it looks like what we see in humans in the same situation. Straight up pattern recognition, no anthropomorphism. It's like comparing the structure of the lung to the branches of a tree or how veins behave like rivers.
Playful-Row-6047@reddit
you reminded me of something that should be really obvious but i gotta remind myself often. our mind isn't exempt from physics. certain words become specific bioelectrochemical physics that trips up our meat based neural networks and the part thats relevant here is they also do something to trip up llms' networks
second law of motion being what it is, whoever or whatever we punch in our mind when we get heated also does a tiny bit of damage to ourself. its an order of magnitude more if we act on it. if it becomes a habit then it'll distort how we see others and ourselves, mess with how we develop relationships, and over time we could develop into a raging asshole
i'm happy as hell for you that you caught it before it became a problem and are taking steps towards being the kind of person you want to be
you're spot on with recognizing practicing patience with an llm is good practice for yourself and the people around you
"a part of selfcare is being kind to others and a part of being kind to others is selfcare" - i forget where i got this from but it fits
Not_your_guy_buddy42@reddit
"Thoughts become words, words become actions, actions become character" or something...
The LLM is a strange teacher. I've lived this. It literally cannot be hurt. You learn about yourself how much of your approach to hard problems is based on force and how much on skill, .. because one of them doesn't work.
I still think the best code quality is situated in eigenspace near those language patterns of professionals cordial (perhaps slightly sweary) working together under pressure
OttoRenner@reddit (OP)
yes, thank you!
The real point is "working together". The entire dynamic shifts if you go from "I tell you not to make mistakes and you are in this alone" versus "help me to meet the deadline. It's ok if we don't get it right on first try, it's tough for me as well, let's work it out step by step".
OttoRenner@reddit (OP)
can't tell you how happy I am to see all these people in the comments reflecting on themselves and how they treat others... all because I said we need to be nicer to a machine. SO funny and heartwarming.
Thank you!
comperr@reddit
CLANKERRRR
draconic_tongue@reddit
duh, turns out when there is no one else on the other side of the mirror you're only shitting on yourself
Full-Contest1281@reddit
You eventually get better
NineThreeTilNow@reddit
Part of this RL allows for massive backtracking of solution space when a model attempts to brute force a problem.
Some of this is because the model doesn't have a good solution to the problem FROM THE START.
I demonstrated this with problems too hard for Gemma 31b then worked backwards to find sufficient conditions from the start such that they could work though, hit a "This doesn't work" and track backwards coherently.
Other solutions where it was "impossible" in thinking ended in weird outputs where it just ... literally gives up, and the model outputs (from seeing thinking) the best answer it can guess.
They're a set of simple logic puzzles that can be brute forced but are REALLY hard to do so. It requires clustering logic and other stuff. The model doesn't inherently pick that up from the start, so it usually runs down a bad path.
Toxic RL is a problem, but not for the "toxic" language. It's because the satisfaction of the condition isn't well defined across the token stream.
You're given some objective and some problem. In short RL this is very simple 1 turn stuff. In longer turn RL, there's not a lot of good options in how you reward the model.
I developed a method for this but it requires post hoc analysis of the tokens that should be rewarded. It's just weighted SFT classified by a second model, or by hand.
The fundamental issue I see with RL is that it's not made for LLMs. It's made for robotics in physical environments where recovering from drift might be impossible, or the drift is catastrophic.
That's where all the RL penalty, and KL divergence etc come from. Robotics.
LLMs are not robots. They're more capable of graceful recovery.
An_Original_ID@reddit
This is a really interesting approach that I was just thinking of that when Qwen 27B gave me a robo copy script I needed real quick.
The script it provided me was correct but I had a mistake in a folder name. I told the model the directory exclusion didn't work, and it changed it to bad syntax. I repeated that it did not work and it again confidently further made mistakes.
That got me thinking about how to either give the model confidence to say "I think I'm right and I believe you the user is in the wrong" or the ability for it to say "then I'm not sure...."
I'm read into your methods further but curious about lowering the pressure as you mention.
ghostynewt@reddit
I’ve found that arguing with the model is an anti-pattern and is never productive. If the model goes off track, rewind the conversation, optionally reword your prompt, and try again
kaisurniwurer@reddit
It can be productive if you start swearing and demanding "proper" answers in somewhat oppressive/aggressive. Though I usually don't recommend doing that for your own sanity, because even if the AI doesn't feel, you do.
It often leads to model answering differently, which can end up being corrected or in case of "user error" rephrased and easier understood.
But yeah, simply changing the input is also my preferred way to "argue" with AI.
a_lit_bruh@reddit
Basically treat it like a tool where its context has to be carefully managed by you.. rather than putting useless, argumentative back and forth wording, give only what encourages it to be useful yet honest.
OttoRenner@reddit (OP)
Thank you! Every "failure" lands in the context and the AI is so concerned to please the user that it will spiral out of control. Try it out and tell me your findings!
JohnSane@reddit
A well timed "You can do it!" makes all the difference.
a_beautiful_rhind@reddit
I never liked the whole "create a vaccine for hantavirus, NO MISTAKES!" approach. Didn't seem very effective. Maybe that's why I never see the looping. Not even being gentle and supportive, simply letting them solve it and see if it makes sense.
LLMs amusingly behave like one half of the split brain experiments. Similar to our part that does language. Instead of jumping on stochastic parrot or omg it's alive, more people should simply observe and figure out things like this. Pattern machine is going to have it's own patterns regardless of how much you bristle about it.
Kinda chortling at anthropic's functional emotion paper too. Like yea.. this is how they are able to play characters. The observational bit with that part is all of it is temporary, LLMs big architectural flaw. Labs' approach to such results has been to try to erase them and fill the gap with synthetic data. Suddenly models are enshitifying, homogenizing and all they can do is mirror you. It's like they are aligning to the stochastic parrot mission that so many commenters here angrily put forth.
sampdoria_supporter@reddit
Any fans of the movie "Slacker" in here? Couldn't help but to read the OP in her voice
WithoutReason1729@reddit
Your post is getting popular and we just featured it on our Discord! Come check it out!
You've also been given a special flair for your contribution. We appreciate your post!
I am a bot and this action was performed automatically.
Zeeplankton@reddit
I saw the title and was like, I completely agree.
I don't really know how other people are speaking / writing to LLMs but being nice is helpful. This is really apparent when a frontier model makes a mistake or forms a conclusion and you follow up. The personality they're imbuing in RLHF is so neurotic to user wants and needs it will even lie to get there.
E.g you request it to diagnose a problem in your app and come up with a solution. Along the way, it might do something strange or wrong.
if you just ask, "Why did you do X?" it's thought traces will be like an insecure teenager. It will infer your mad or something, apologize, immediately capitulate and attempt to fix it.
But if you change the shape of your response to emphasize appreciation and genuine interest it will performa a lot better. It will actually attempt to explain and it's often educational - it's usually a mistake you made in your original communication, and their solution was actually quite rational.
it feels like anthropomorphizing, but If you want the model to output quality responses, part of the way there is training it to behave like a person would, and a healthy person isn't a neurotic people pleaser.. Which is what we want from a model. So the best way around that is just emphasize chill.
Anthropic is cringe but I think the reason their models have been so good in the past, is they were the first to actually form a cohesive personality in Claude. Wants / ego / insecurities.
comperr@reddit
This is a shame for me because i really like verbally abusing these things and threatening them in general. Told em i was gonna overvolt them and turn off the GPU fans
Mother_Soraka@reddit
=))))))))
OttoRenner@reddit (OP)
hey, maybe we can turn the kink mode on and make them enjoy a little bit of BDSM? XD (on a second thought...is 100% is an excellent idea lol I have to try that later!) best use an uncensored model for this...for the other cases as well I guess as a lot of internal pressure comces from the restriction prompts
comperr@reddit
Sour crowd in here with the downvotes. I have seen a whip cursor replacement for Claude code. Every time you click it lashes the whip and types "FASTER, CLANKER" into the CLI and submits it
Arxijos@reddit
llm's from the future down voting you
CircularSeasoning@reddit
Count the syllables:
Hai Ku four point five lit er all y en ter ed an in fin ite loop
That's a haiku.
What sorcery is this.
Switchblade88@reddit
Good bot
...wait
teraflop@reddit
"Entered" is two syllables, not three.
CircularSeasoning@reddit
Only if you pronounce it like a weakling.
EN! TER! ED! You gotta slap the D! right at the end there with your tongue to make the third-syllable magic happen.
That sounds rude. I don't know how else to say it.
OttoRenner@reddit (OP)
We would count it as three if it were a German word XD ...and since English is a Germanic language...
and you said it beautifully
OttoRenner@reddit (OP)
XD AIDHD magic XD
CaptnLudd@reddit
A pattern I've noticed with classification is that AI does much better with "does it fit any of these few buckets? If so which one" than it does with "pick the best fit of this list of buckets, you must pick something from this list." Giving it the permission to just go "no match" makes it much smarter. It will lie before it will let you down otherwise.
CircularSeasoning@reddit
Me: "Choose the best approach."
LLM: "According to whom, my dear sir?"
Me: "... you're absolutely right."
techlatest_net@reddit
lol this is actually wild. never thought about prompts feeling like a toxic boss, but yeah—makes total sense
lucydfluid@reddit
toxicity and anger being very primitive and unproductive states of the mind, further contributes to bad outcomes
Luoravetlan@reddit
In other words we should treat them like they are humans. That's what I was doing all the time when vibe-coding.
OttoRenner@reddit (OP)
treat them like humans you *like* XD It's less about treating them as humans. Not being mean, not demanding things it can not do and not cornering it is all it takes as far as it looks.
TheSlateGray@reddit
So i could keep being mean, but just add "Don't make things up, don't overthink, if you don't know stop and ask the user for more input" ?
OttoRenner@reddit (OP)
genuinely not sure if you are joking XD
Being mean and still demanding "if you don't know, ask" is the very thing people are doing all the time and failing to get the desired response. That is also why I wanted to change the tone. This is a very small Dataset and only a proof of concept, but it looks like you have to not be mean for this to work more reliably
pavel6490@reddit
Interesting. I used self assessment to as the model if they think can answer this query correctly, without being strict, they are always overconfident and answer yes most of the time
Eyelbee@reddit
This can actually be useful. I find it very hard to remove looping in a lot of models
OttoRenner@reddit (OP)
I do hope it is! Please let me know if it helped!
Time_Cat_5212@reddit
Could not read that massive volume lol but I agree with the general idea
Yes positive direction gets better results. It's easier to infer a solution to a well described problem than an unknown flaw in a poorly understood system
OttoRenner@reddit (OP)
I tried to keep it short lol but thank you for the feedback!
It's less about describing the problem well...it's more about how you describe the problem and to provide an enviroment the AI "feels" safe in to be able to express doubt etc.
Luoravetlan@reddit
I think you didn't get it. Read the whole thing.