Maxious

MiniMax-M2.1 Uncensored: PRISM Advanced Abliteration

Posted by Maxious@reddit | LocalLLaMA | 12 comments

GLM-4.7-REAP-50-W4A16: 50% Expert-Pruned + INT4 Quantized GLM-4 (179B params, ~92GB)

Posted by Maxious@reddit | LocalLLaMA | 74 comments

Run Qwen3-Next-80B on 8GB GPU at 1tok/2s throughput

Posted by Maxious@reddit | LocalLLaMA | 5 comments

Surprisingly Fast AI-Generated Kernels We Didn’t Mean to Publish (Yet)

Posted by Maxious@reddit | LocalLLaMA | 54 comments