muxxington
-
brute-llama - A llama.cpp llama-server testbench
Posted by muxxington@reddit | LocalLLaMA | View on Reddit | 9 comments
-
Lads, time to recompile llama.cpp
Posted by muxxington@reddit | LocalLLaMA | View on Reddit | 56 comments
-
[emoji]
Posted by muxxington@reddit | LocalLLaMA | View on Reddit | 9 comments
-
Solution for Qwen3-Coder-Next with llama.cpp/llama-server and Opencode tool calling issue
Posted by muxxington@reddit | LocalLLaMA | View on Reddit | 10 comments
-
Conclusion: Sesame has shown us a CSM. Then Sesame announced that it would publish... something. Sesame then released a TTS, which they obviously misleadingly and falsely called a CSM. Do I see that correctly?
Posted by muxxington@reddit | LocalLLaMA | View on Reddit | 106 comments
-
Poor man's X79 motherboard ETH79-X5
Posted by muxxington@reddit | LocalLLaMA | View on Reddit | 20 comments
-
:|
Posted by muxxington@reddit | LocalLLaMA | View on Reddit | 18 comments
-
There it is https://github.com/SesameAILabs/csm
Posted by muxxington@reddit | LocalLLaMA | View on Reddit | 76 comments
-
OpenAI compatible API for Flowise
Posted by muxxington@reddit | LocalLLaMA | View on Reddit | 0 comments
-
Gottcha!
Posted by muxxington@reddit | LocalLLaMA | View on Reddit | 17 comments
-
Gottcha!
Posted by muxxington@reddit | LocalLLaMA | View on Reddit | 0 comments
-
I managed to reduce Tesla P40 idle power consumption
Posted by muxxington@reddit | LocalLLaMA | View on Reddit | 40 comments
-
P40 still worth it?
Posted by muxxington@reddit | LocalLLaMA | View on Reddit | 16 comments
-
How to unlock cheap mining boards like the ETH79-X5 to support unsupported GPUs
Posted by muxxington@reddit | LocalLLaMA | View on Reddit | 0 comments
-
Is there an aider equivalent for sysadmins?
Posted by muxxington@reddit | LocalLLaMA | View on Reddit | 8 comments
-
Improved handling of multiple Tesla P40/P100 with multiple llama.cpp instances
Posted by muxxington@reddit | LocalLLaMA | View on Reddit | 15 comments
-
gppm now manages your llama.cpp instances seamlessly with a touch of kubernetes ...besides saving 40 Watt of idle power per Tesla P40 or P100 GPU
Posted by muxxington@reddit | LocalLLaMA | View on Reddit | 5 comments
-
gppm now handles your llama.cpp instances with a touch of kubernetes ...besides saving 40 Watt of idle power per GPU
Posted by muxxington@reddit | LocalLLaMA | View on Reddit | 0 comments
-
gppm now launches llama.cpp with Tesla P40 or P100 with a touch of kubernetes
Posted by muxxington@reddit | LocalLLaMA | View on Reddit | 0 comments
-
For those who run multiple llama.cpp instances sharing Tesla P40
Posted by muxxington@reddit | LocalLLaMA | View on Reddit | 0 comments
-
Question on FP32 FlashAttention in llama.cpp
Posted by muxxington@reddit | LocalLLaMA | View on Reddit | 8 comments