Remove_Ayys

For llama.cpp/ggml AMD MI50s are now universally faster than NVIDIA P40s

Posted by Remove_Ayys@reddit | LocalLLaMA | 160 comments
llama.cpp: Automation for GPU layers, tensor split, tensor overrides, and context size (with MoE optimizations)

Posted by Remove_Ayys@reddit | LocalLLaMA | 68 comments
llama.cpp vs. vllm performance comparison

Posted by Remove_Ayys@reddit | LocalLLaMA | 36 comments
llama.cpp vs vllm performance comparison

Posted by Remove_Ayys@reddit | LocalLLaMA | 0 comments
Elo HeLLM: Elo-based language model ranking

Posted by Remove_Ayys@reddit | LocalLLaMA | 4 comments
llama.cpp multi GPU support has been merged

Posted by Remove_Ayys@reddit | LocalLLaMA | 12 comments
llama.cpp: owners of old GPUs wanted for performance testing

Posted by Remove_Ayys@reddit | LocalLLaMA | 93 comments
llama.cpp FlashAttention: P100 owners wanted for testing

Posted by Remove_Ayys@reddit | LocalLLaMA | 25 comments