Remove_Ayys
-
For llama.cpp/ggml AMD MI50s are now universally faster than NVIDIA P40s
Posted by Remove_Ayys@reddit | LocalLLaMA | 160 comments
-
llama.cpp: Automation for GPU layers, tensor split, tensor overrides, and context size (with MoE optimizations)
Posted by Remove_Ayys@reddit | LocalLLaMA | 68 comments
-
llama.cpp vs. vllm performance comparison
Posted by Remove_Ayys@reddit | LocalLLaMA | 36 comments
-
llama.cpp vs vllm performance comparison
Posted by Remove_Ayys@reddit | LocalLLaMA | 0 comments
-
Elo HeLLM: Elo-based language model ranking
Posted by Remove_Ayys@reddit | LocalLLaMA | 4 comments
-
llama.cpp multi GPU support has been merged
Posted by Remove_Ayys@reddit | LocalLLaMA | 12 comments
-
llama.cpp: owners of old GPUs wanted for performance testing
Posted by Remove_Ayys@reddit | LocalLLaMA | 93 comments
-
llama.cpp FlashAttention: P100 owners wanted for testing
Posted by Remove_Ayys@reddit | LocalLLaMA | 25 comments