NickNau

'Token Counter' app to get number of tokens in local files. Supports various models.

Posted by NickNau@reddit | LocalLLaMA | View on Reddit | 14 comments
[TEST] Prompt Processing VS Inference Speed VS GPU layers

Posted by NickNau@reddit | LocalLLaMA | View on Reddit | 20 comments
LLM as survival knowledge base

Posted by NickNau@reddit | LocalLLaMA | View on Reddit | 152 comments
Speculative decoding for on-CPU MoE?

Posted by NickNau@reddit | LocalLLaMA | View on Reddit | 12 comments
Speculative decoding can identify broken quants?

Posted by NickNau@reddit | LocalLLaMA | View on Reddit | 124 comments
Proxy to get Claude in Open WebUI

Posted by NickNau@reddit | LocalLLaMA | View on Reddit | 8 comments
Qwen2.5 1M context works on llama.cpp?

Posted by NickNau@reddit | LocalLLaMA | View on Reddit | 1 comment
[FINAL TEST] Power limit VS Core clock limit efficiency

Posted by NickNau@reddit | LocalLLaMA | View on Reddit | 22 comments
Will we continue to tolerate political bots?

Posted by NickNau@reddit | LocalLLaMA | View on Reddit | 192 comments
GPU Speed vs Tokens per Second & Power Draw [test results]

Posted by NickNau@reddit | LocalLLaMA | View on Reddit | 35 comments
How to run DeepSeek V2.5 quants?

Posted by NickNau@reddit | LocalLLaMA | View on Reddit | 15 comments
Need advice on 6x3090 inference software setup

Posted by NickNau@reddit | LocalLLaMA | View on Reddit | 27 comments
Need advice on 4x3090 inference build

Posted by NickNau@reddit | LocalLLaMA | View on Reddit | 52 comments
True random answer from LLM

Posted by NickNau@reddit | LocalLLaMA | View on Reddit | 12 comments
bartowski vs TheDrummer etc

Posted by NickNau@reddit | LocalLLaMA | View on Reddit | 23 comments
DDR5 2x32 + 2x48 - will it work?

Posted by NickNau@reddit | buildapc | View on Reddit | 9 comments