AdelicLlama-3.1-8B-Instruct [Breakthrough Research Model]

Posted by LooseSwing88@reddit | LocalLLaMA

Happy to announce the release of AdelicLlama.

I'm a total newbie at this, so excuse the lack of info on the Hugging Face page.

Memory use by context length:

At 2,000 tokens: 262 MB shrinks to 33 MB (87.2% reduction).
At 100,000 tokens: 13.1 GB shrinks to 33 MB (99.7% reduction).
At 1,000,000 tokens: 131 GB (OOM on consumer GPUs) shrinks to 33 MB (99.97% reduction).
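For anyone who wants to check the arithmetic, here is a rough sketch that reproduces the memory figures above. It rests on assumptions that are not stated in the post: that the numbers refer to the fp16 KV cache, that the base model has the standard Llama-3.1-8B attention shape (32 layers, 8 KV heads, head dimension 128), and that the compressed state keeps 256 entries per head (borrowed from the 256 dot products figure). The function names are made up for illustration.

```python
# Assumed Llama-3.1-8B attention shape; not taken from the post itself.
NUM_LAYERS = 32
NUM_KV_HEADS = 8
HEAD_DIM = 128
BYTES_PER_VALUE = 2   # fp16 / bf16
FIXED_SLOTS = 256     # assumption: compressed state holds 256 entries per head


def kv_cache_bytes(num_tokens: int) -> int:
    """Bytes needed to store keys and values for num_tokens positions."""
    per_token = 2 * NUM_LAYERS * NUM_KV_HEADS * HEAD_DIM * BYTES_PER_VALUE
    return per_token * num_tokens


def reduction_percent(num_tokens: int, fixed_slots: int = FIXED_SLOTS) -> float:
    """Percent saved when the cache is capped at fixed_slots positions."""
    baseline = kv_cache_bytes(num_tokens)
    compressed = kv_cache_bytes(fixed_slots)
    return 100.0 * (1.0 - compressed / baseline)


def format_megabytes(num_bytes: int) -> str:
    """Decimal megabytes (1 MB = 10**6 bytes), one decimal place."""
    return f"{num_bytes / 1e6:,.1f} MB"


if __name__ == "__main__":
    for tokens in (2_000, 100_000, 1_000_000):
        print(
            f"{tokens:>9,} tokens: {format_megabytes(kv_cache_bytes(tokens))} "
            f"-> {format_megabytes(kv_cache_bytes(FIXED_SLOTS))} "
            f"({reduction_percent(tokens):.2f}% reduction)"
        )
```

Under these assumptions each token costs 131,072 bytes, so 2,000 tokens come to 262,144,000 bytes and 256 slots come to 33,554,432 bytes, which lines up with the 262 MB and 33 MB figures. The reduction percentage only depends on the ratio 256 / context length, not on the model shape.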

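At a 100,000-token context, baseline attention computes 100,000 dot products per head for each generated token. Adèlic computes 256 dot products per head. That is ~390x fewer dot products (100,000 / 256), which is where the claimed ~390x token generation speedup comes from.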
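A quick sanity check of the ~390x figure, assuming it is the ratio of query-key dot products per head for one generated token rather than a measured wall-clock latency. The helper name is hypothetical.

```python
def dot_product_ratio(context_len: int, fixed_slots: int = 256) -> float:
    """Ratio of per-head dot products: full context vs. a fixed number of slots.

    Standard attention scores one new query against every cached key, so it
    needs context_len dot products per head; the fixed-size variant is assumed
    to need only fixed_slots.
    """
    if context_len <= 0 or fixed_slots <= 0:
        raise ValueError("context_len and fixed_slots must be positive")
    return context_len / fixed_slots


if __name__ == "__main__":
    print(dot_product_ratio(100_000))  # 390.625
```

This is a count of score computations only. Real end-to-end speedup also depends on the MLP layers, memory bandwidth, and whatever work is needed to maintain the 256 entries, so the measured number could differ.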

Hugging Face: https://huggingface.co/sneedjak/AdelicLlama-3.1-8B-Instruct
GitHub: https://github.com/sneed-and-feed/adelic-spectral-zeta/tree/main