yuicebox

Do I need to use Ollama to get the full feature set of GLM-OCR with a GGUF model format?

Posted by yuicebox@reddit | LocalLLaMA | 8 comments
ChatGPT Python Sandbox requirements.txt

Posted by yuicebox@reddit | LocalLLaMA | 0 comments
Email Sesame AI and express your disappointment

Posted by yuicebox@reddit | LocalLLaMA | 19 comments
Confusing results using ExLlamaV2 with 72B models on 48 GB (Q4/Q8 cache, max_seq_len)

Posted by yuicebox@reddit | LocalLLaMA | 15 comments
What is your preferred front-end/back-end these days? (Q2 2024)

Posted by yuicebox@reddit | LocalLLaMA | 79 comments
RTX 4090 and 96 GB 6400 MHz DDR5 RAM - Can I get more than 2 tokens/second on 70B?

Posted by yuicebox@reddit | LocalLLaMA | 36 comments
Llama 3 prompt format question

Posted by yuicebox@reddit | LocalLLaMA | 2 comments