wombatsock

An LLM hard-coded into silicon that can do inference at 17k tokens/s???

Posted by wombatsock@reddit | LocalLLaMA | 72 comments
Inference on new Framework desktop

Posted by wombatsock@reddit | LocalLLaMA | 23 comments
Renting your very own GPU from DigitalOcean

Posted by wombatsock@reddit | LocalLLaMA | 8 comments
"Refusal in LLMs is mediated by a single direction" - research findings on a simple way to jailbreak any LLM

Posted by wombatsock@reddit | LocalLLaMA | 12 comments