Karpathy's MicroGPT running at 50,000 tps on an FPGA

Posted by jawondo@reddit | LocalLLaMA | 34 comments

Sure, it's only 4,192 parameters, but it's a start. The project write-up is here: https://v2.talos.wtf/ and the GitHub repository is here: https://github.com/Luthiraa/TALOS-V2

Some of the speed comes from keeping the weights in onboard ROM rather than in external memory. With 16-bit weights, onboard ROM limits current FPGAs to about 20-30 million parameters, but maybe this and Taalas (https://taalas.com/ - the similar names are unlikely to be a coincidence) will lead to more onboard ROM appearing in FPGAs, or to FPGAs dedicated to SLMs.
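
For a rough sense of scale, here is a back-of-the-envelope sketch of how much weight storage those parameter counts imply. It assumes every weight takes exactly 16 bits with no packing overhead, and that the 4,192-parameter model also uses 16-bit weights (the post only gives 16 bits for the 20-30 million figure). The function names are made up for illustration, and MB here means decimal megabytes.

```python
# Back-of-the-envelope weight storage, assuming 16 bits per weight
# and no overhead. Names here are illustrative, not from the TALOS-V2 repo.

BITS_PER_WEIGHT = 16  # the post's "16 bit weights"


def weight_storage_bits(num_params, bits_per_weight=BITS_PER_WEIGHT):
    """Total bits needed to hold num_params weights."""
    return num_params * bits_per_weight


def bits_to_megabytes(bits):
    """Convert bits to decimal megabytes (1 MB = 1,000,000 bytes)."""
    return bits / 8 / 1_000_000


micro_bits = weight_storage_bits(4_192)        # the model in the post
low_bits = weight_storage_bits(20_000_000)     # lower end of the estimate
high_bits = weight_storage_bits(30_000_000)    # upper end of the estimate

print(f"4,192 params: {micro_bits} bits ({micro_bits // 8} bytes)")
print(f"20M params:   {bits_to_megabytes(low_bits):.0f} MB")
print(f"30M params:   {bits_to_megabytes(high_bits):.0f} MB")
```

Under those assumptions the 4,192-parameter model needs about 8 KB of weight storage, while the 20-30 million parameter ceiling corresponds to roughly 40-60 MB held on the chip.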