"The car wash is 100 meters from my house. Should I walk to the car wash or drive there?"
Posted by jopereira@reddit | LocalLLaMA | 17 comments
"The car wash is 100 meters from my house. Should I walk to the car wash or drive there?"
That prompt makes me itch. It leaves out so many variables, yet we expect the LLM to come up with the 'correct' solution - the one we have in mind.
Where is the car? Where are you? Is that the car you want to wash? Or do you just want to walk over there?
That's why so many people struggle to use LLMs proficiently.
That's why so many people say local LLMs are far behind hosted/SOTA ones. In fact, they are.
But in the real world, I can live with a local LLM that's clearly smarter and faster than I am. I use it to be a better me.
I just want to put that out there so I can stop itching.
OkCancel9581@reddit
All of these things are kinda implied; it's expected that you go to a car wash to wash your car. If your task has uncommon conditions, you have to clarify them yourself, otherwise you just sound socially awkward. It's like when someone asks you to buy apples: you don't ask them to clarify whether you should buy the apples and bring them home, or just pay for them and leave them at the shop.
jopereira@reddit (OP)
That's my whole point: LLMs don't work like people. They need a different approach - it's like driving a rally car versus a track car, each needs its own technique. And that's why people have so many misunderstandings: they assume they're in tune (sharing the same context) when they really aren't. Look up the Abilene Paradox.
OkCancel9581@reddit
While I agree LLMs don't work like people (since obviously it's a piece of software and not an actual biological brain), it's also important to understand that it's designed to work like people: trained on human outputs to mimic human behaviour. And if 90% of the humans in the datasets don't ask clarifying questions and take all the little details for granted, the LLM will do that too.
jopereira@reddit (OP)
There is a debate about whether intelligence (a still-undefined concept) is substrate-dependent. Many think it is not: whether it is carbon-based or silicon-based, what matters is its ability to solve problems.
For that reason, I prefer to give the context but not prescribe a roadmap (my personal bias), as that would limit the solution landscape.
Hot-Employ-3399@reddit
Or maybe because smart models tend to be stupid.
"I want to wash my car. The car wash is 50 meters away. Should I walk or drive?" is a much clearer prompt, as it states both the goal and the choice to be made. Yet AGI is still not coming.
darkwingfuck@reddit
I'm really unimpressed with how folks trip over these gotcha questions. It's not clever to find a wrong way to use a tool. I want a code-writer, not a toy genie.
The models are trained with RLHF to be helpful question answerers. Why the fuck would we waste layers and parameters to have them behave robustly with an adversarial user?
Do we want thinking tokens wasted on every prompt asking "Wait -- is this a riddle?"
Shinkai_I@reddit
What you said makes some sense, but I don't completely agree.
I think the problem lies in the current approach to model training.
When a normal person hears this kind of question, they won't give you a direct answer; they'll likely follow up with questions like, "Why are you going to the car wash?"
The ability to identify missing links and clearly define the problem is more important than answering the question itself.
Otherwise, providing numerous answers is just exhaustive enumeration and is likely meaningless for most problems.
jopereira@reddit (OP)
I agree. I see a huge difference between using PLAN mode to say what I need (even simple things like: move that button to the top right, make it darker, and disable it if this condition occurs) and going directly to the coding agent. The (GitHub Copilot) plan mode is particularly useful because it asks for clarifications IF the prompt is ambiguous for that particular codebase.
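Roughly, the value is in forcing a separate planning turn before any code turn. A minimal sketch of that loop, assuming a hypothetical llm() helper wired to whatever local model you run (this is not Copilot's actual implementation):

```python
# Plan-then-code loop, as a rough sketch. llm() is a hypothetical
# stand-in for a chat call to whatever local model you run.

def llm(prompt: str) -> str:
    raise NotImplementedError("wire this to your model server")

def plan_then_code(request: str, codebase_summary: str) -> str:
    # Turn 1: ask for a plan and clarifying questions, explicitly no code.
    plan = llm(
        "PLAN mode: do not write code yet.\n"
        f"Codebase summary:\n{codebase_summary}\n"
        f"Request: {request}\n"
        "List the steps you would take. If anything is ambiguous for "
        "this codebase, ask clarifying questions instead of guessing."
    )
    if "?" in plan:
        # Ambiguity surfaces here, before any code is written.
        return plan
    # Turn 2: only an unambiguous plan goes to the coding step.
    return llm(f"Implement this plan as a concrete code change:\n{plan}")
```

The crude `"?" in plan` check is only there to make the branch visible; a real setup would hand the model's questions back to you to answer.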
BannedGoNext@reddit
Dafuq are you talking about? Non-local models fuck this question up all the time; only recently have some started getting it right, and it's not because of intelligence, it's because the question has leaked into the training data by now.
jopereira@reddit (OP)
I'm talking about what we should expect from an LLM and how to get there. You're right. And even local LLMs get it right for the same reason. As we can see in some responses, people still expect an LLM to behave like an adult person (with a wide context built from life experience, and lots of assumptions).
Such_Advantage_6949@reddit
If a normal person thinks the question is vague, they will ask for clarification, but an AI can "confidently" give you a wrong answer. In all the examples people shared, there is not one where the AI simply asked "do you intend to wash the car?" like a human would.
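One way to get closer to that behaviour is to license clarification explicitly in the system prompt. A minimal sketch, assuming an OpenAI-compatible local endpoint (llama.cpp's server, vLLM, and Ollama all expose one); the URL and model name below are placeholders:

```python
# Sketch: nudge the model to clarify before answering.
# Assumes an OpenAI-compatible local server; URL/model are placeholders.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed")

CLARIFY_FIRST = (
    "If the request omits details you need in order to answer correctly, "
    "do not guess. Reply only with the single most important clarifying "
    "question, e.g. 'Do you intend to wash the car?'"
)

resp = client.chat.completions.create(
    model="local-model",  # placeholder name
    messages=[
        {"role": "system", "content": CLARIFY_FIRST},
        {"role": "user", "content": "The car wash is 100 meters from my "
                                    "house. Should I walk or drive there?"},
    ],
)
print(resp.choices[0].message.content)
```

Whether the model actually obeys depends on how it was instruction-tuned; many small models will still answer anyway.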
Perfect-Campaign9551@reddit
People who expect what you're describing don't understand what context means in an LLM.
Far_Composer_5714@reddit
Generally what I see is Gemini will shotgun it. It takes multiple simultaneous paths trying to increase the breadth of the response in normal use.
So while LLMs may not ask for clarification, it's not unexpected that they would explore multiple potential paths.
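That pattern can also be scripted deliberately. A rough sketch of the "shotgun" idea, again with a hypothetical llm() stand-in: enumerate plausible readings of the prompt, answer each one, and present the branches instead of asking a clarifying question:

```python
# "Shotgun" sketch: answer every plausible reading of an ambiguous
# prompt instead of asking for clarification. llm() is a hypothetical
# stand-in for a call to whatever model you run.

def llm(prompt: str) -> str:
    raise NotImplementedError("wire this to your model")

def shotgun(prompt: str, n_readings: int = 3) -> str:
    # First pass: enumerate distinct interpretations, one per line.
    readings = llm(
        f"List {n_readings} distinct plausible interpretations of this "
        f"request, one per line, no numbering:\n{prompt}"
    ).splitlines()
    # Second pass: answer the prompt under each interpretation.
    branches = [
        f"If you meant '{r.strip()}': " + llm(f"{prompt}\nAssume: {r.strip()}")
        for r in readings if r.strip()
    ]
    return "\n\n".join(branches)
```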
a_beautiful_rhind@reddit
It's an updated "how does a person with no arms wash their hands".
Bit of a "trick", but the LLM should deduce these things at some point. It's obviously not smarter and faster than you when it makes these mistakes. And those mistakes WILL bite you later, despite your purported prompting skills.
singalen@reddit
That's MUCH fewer variables than in the context of a typical engineering task.
Alternative_You3585@reddit
A good LLM should handle all the questions you mentioned: "Where is the car? Where are you? Is that the car you want to wash? Or do you just want to walk over there?" Indirectly, we want the LLM to understand us; humans sadly don't express themselves directly.
Same for code: you can give an abstract task you want executed, or you can give detailed pseudocode.
EbbNorth7735@reddit
I agree. It should be: "I need to wash my car at a car wash. The car wash is 1 km away. Should I walk or drive to the car wash?" The question should state the intent - what the purpose actually is.