Is it possible to run a simple LLM (e.g. llama2) with very little RAM (e.g. 16 MB)?

Posted by galapag0@reddit | LocalLLaMA | 28 comments

I'm wondering whether it is possible to run a small llama2 LLM on MS-DOS as a fun side project. In theory, it could be compiled with Open Watcom v2 (assuming we change the C file to conform to C89) and the mmap call could be replaced with a malloc (while still dealing with the limited memory). Any hints?
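To make the mmap-to-malloc idea concrete, here is a minimal sketch of what I have in mind. The helper name `read_weights` and the 4096-byte chunk size are my own placeholders, not anything taken from the llama2 code, and it assumes a 32-bit DOS extender target so that a single malloc can be larger than 64 KB. It only works if the whole weights file fits in free RAM, so with 16 MB the model would have to be tiny.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper (not from the original source): read a whole file
   into a malloc'd buffer, standing in for the mmap call.
   Written in C89 style. Returns NULL on any failure; the caller must
   free() the returned buffer. */
static void *read_weights(const char *path, long *out_size)
{
    FILE *f;
    long size;
    char *buf;
    size_t got, total;

    f = fopen(path, "rb");
    if (f == NULL) return NULL;

    /* Find the file size by seeking to the end. */
    if (fseek(f, 0L, SEEK_END) != 0) { fclose(f); return NULL; }
    size = ftell(f);
    if (size <= 0) { fclose(f); return NULL; }
    rewind(f);

    /* This is the step that fails when the model does not fit in RAM. */
    buf = (char *)malloc((size_t)size);
    if (buf == NULL) { fclose(f); return NULL; }

    /* Read in small chunks (4096 bytes is an arbitrary choice). */
    total = 0;
    while (total < (size_t)size) {
        size_t want = (size_t)size - total;
        if (want > 4096) want = 4096;
        got = fread(buf + total, 1, want, f);
        if (got == 0) break;
        total += got;
    }
    fclose(f);

    if (total != (size_t)size) { free(buf); return NULL; }
    *out_size = size;
    return buf;
}
```

If the weights do not fit, the same fseek/fread calls could presumably load one layer at a time into a reused buffer, trading speed for memory.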