Every new large model release for cheapos...
Posted by Vektor-Mem@reddit | LocalLLaMA | View on Reddit | 10 comments
LocalLLaMA-ModTeam@reddit
Rule 3
iMakeSense@reddit
r/povertyLocalLLaMA
sedikit-gila@reddit
Finally
Specter_Origin@reddit
Okay, that was just unexpected and uncalled for xD
bucolucas@reddit
12GB still feels like poverty eh? Guess I'll wait till I can spring for the 29-35B capable cards
SkyFeistyLlama8@reddit
There's VRAM and then there's weird janky-ass inference using barely supported laptop iGPUs or obscure CPUs.
Whatever works!
ready_to_fuck_yeahh@reddit
I feel violated
Long_comment_san@reddit
That's me with 12
Blastronomicon@reddit
Plastic-Stress-6468@reddit
Like, I know you can measure KL divergence from Unsloth's posts comparing their quants against others', but I genuinely don't think I'm able to tell that a quant is 5% or 10% "better" or closer to the bf16 version.
The moment I know which model is which, confirmation bias is going to hijack my subjective perception. The theoretically better model is going to feel like it's performing better, and vice versa for the technically worse one. And then my memory gets further hijacked by that one standout use case that left a random lasting impression on me for whatever reason.
Like, maybe I can tell a Q1 model from a Q8, but I have never used a Q1 or a Q8 anyway. I can't fit the Q8, and the Q1 is probably just a waste of time.
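For anyone curious what that KL-divergence comparison actually measures: it's the average "surprise" of the quantized model's next-token distribution relative to the bf16 one. A minimal sketch (the logit values here are made up for illustration, not from any real model):

```python
import math

def softmax(logits):
    """Convert raw logits into a probability distribution over the vocab."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    """KL(P || Q) in nats: how much Q (quant) diverges from P (bf16)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical next-token logits for a tiny 3-token vocab:
bf16_logits = [2.0, 1.0, 0.1]   # reference model
quant_logits = [1.9, 1.1, 0.2]  # slightly perturbed by quantization

kl = kl_divergence(softmax(bf16_logits), softmax(quant_logits))
# kl is a small positive number; averaged over many tokens, lower is better
```

In practice this is computed per token over a large text corpus and averaged; identical distributions give exactly 0, and bigger quantization damage shows up as a higher mean KL.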