Make your AI talk like a caveman and decrease token usage

Posted by RegionCareful7282@reddit | LocalLLaMA

I’ve been working on a little side project to help LLMs talk like… cavemen.
Why? To save tokens, of course.

It works because LLMs can easily fill in grammar and connectives on their own. So we strip the predictable words, keep the meaningful ones, and the model still recovers the intended meaning.
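To make the idea concrete, here's a minimal sketch in Python. This is my own illustration, not the repo's implementation (the project may well use an LLM prompt rather than a fixed word list); `caveman_compress` and the `FILLER` set are hypothetical names:

```python
import re

# Hypothetical list of predictable function words to strip.
# A real implementation might use an LLM prompt or a frequency model instead.
FILLER = {
    "the", "a", "an", "is", "are", "was", "were", "be", "been", "being",
    "of", "to", "in", "on", "at", "by", "for", "with", "and", "or", "but",
    "that", "this", "it", "its", "as",
}

def caveman_compress(text: str) -> str:
    """Drop predictable function words; keep the content words."""
    tokens = re.findall(r"\w+|[^\w\s]", text)
    kept = [t for t in tokens if t.lower() not in FILLER]
    return " ".join(kept)

print(caveman_compress("The tokens that we strip are the ones the model can predict."))
# -> "tokens we strip ones model can predict ."
```

The shorter string tokenizes to fewer tokens, which is where the savings come from.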

You can also store RAG documents in caveman-compressed form: each chunk carries more information per token, more chunks fit in the context window, and retrieval quality improves.
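Here's a rough sketch of what that ingestion step could look like, reusing `caveman_compress` from above. The `embed` stub and in-memory `index` are placeholders I made up for illustration; a real pipeline would plug in your embedding model and vector store:

```python
from typing import List, Tuple

def embed(text: str) -> List[float]:
    # Stub standing in for a real embedding model call.
    return [float(ord(c)) for c in text[:8]]

# Toy in-memory index: (vector, compressed chunk) pairs.
index: List[Tuple[List[float], str]] = []

def ingest(chunks: List[str]) -> None:
    for chunk in chunks:
        compressed = caveman_compress(chunk)  # sketch from above
        index.append((embed(compressed), compressed))

ingest(["The quick brown fox jumps over the lazy dog."])
```

Presumably you'd compress queries the same way before embedding them, so queries and stored chunks live in the same style of text.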

Thought I'd share it here since it might help others stop wasting tokens on unnecessary words :)

Feel free to contribute if you have any additions!

https://github.com/wilpel/caveman-compression