wombatsock
-
An LLM hard-coded into silicon that can do inference at 17k tokens/s???
Posted by wombatsock@reddit | LocalLLaMA | 72 comments
-
Inference on new Framework desktop
Posted by wombatsock@reddit | LocalLLaMA | 23 comments
-
Renting your very own GPU from DigitalOcean
Posted by wombatsock@reddit | LocalLLaMA | 8 comments
-
"Refusal in LLMs is mediated by a single direction" - research findings on a simple way to jailbreak any LLM
Posted by wombatsock@reddit | LocalLLaMA | 12 comments