ifioravanti

Apple M3 Ultra 512GB vs NVIDIA RTX 3090 LLM Benchmark

Posted by ifioravanti@reddit | LocalLLaMA | View on Reddit | 57 comments
Apple MLX Quantizations Royal Rumble 🔥

Posted by ifioravanti@reddit | LocalLLaMA | View on Reddit | 9 comments
Ollama performance on M2 Ultra - M3 Max - Windows Nvidia 3090 and WSL2 Nvidia 3090

Posted by ifioravanti@reddit | LocalLLaMA | View on Reddit | 7 comments
🔥 DeepSeek R1 671B Q4 - M3 Ultra 512GB with MLX🔥

Posted by ifioravanti@reddit | LocalLLaMA | View on Reddit | 209 comments
This M2 Ultra vs M3 Ultra benchmark by Matt Tech Talks is just wrong!

Posted by ifioravanti@reddit | LocalLLaMA | View on Reddit | 67 comments
Llama 405B running locally!

Posted by ifioravanti@reddit | LocalLLaMA | View on Reddit | 63 comments
DeepSeek R1 Distill Qwen 7B Q4 large context (up to 128K) tests

Posted by ifioravanti@reddit | LocalLLaMA | View on Reddit | 13 comments
LM Studio - Hugging Face Model Manager

Posted by ifioravanti@reddit | LocalLLaMA | View on Reddit | 6 comments
🤔 Can the iogpu.wired_limit_mb setting on macOS be persisted? 📣 YES!

Posted by ifioravanti@reddit | LocalLLaMA | View on Reddit | 6 comments
Best LLM with large context window? Not for coding in Feb 2024

Posted by ifioravanti@reddit | LocalLLaMA | View on Reddit | 1 comment
MLX now supports loading GGUF directly from Hugging Face models

Posted by ifioravanti@reddit | LocalLLaMA | View on Reddit | 1 comment