Osama_Saba

Q3 is absolute garbage, but we always use Q4. Is it good?

Posted by Osama_Saba@reddit | LocalLLaMA | View on Reddit | 24 comments
People who don't enable flash attention - what's your problem?

Posted by Osama_Saba@reddit | LocalLLaMA | View on Reddit | 46 comments
What do we use for real-time English speech recognition with low VRAM?

Posted by Osama_Saba@reddit | LocalLLaMA | View on Reddit | 16 comments
Is vLLM faster than Ollama?

Posted by Osama_Saba@reddit | LocalLLaMA | View on Reddit | 10 comments
Why no more progress in multimodals under 10B? It's too slow. I need something new or I sell my GPU. Not really joking, but why?

Posted by Osama_Saba@reddit | LocalLLaMA | View on Reddit | 11 comments
Qwen 14B is better than me...

Posted by Osama_Saba@reddit | LocalLLaMA | View on Reddit | 362 comments
Wait, can I abuse Hugging Face for storage for free??

Posted by Osama_Saba@reddit | LocalLLaMA | View on Reddit | 13 comments
Cached input locally?????

Posted by Osama_Saba@reddit | LocalLLaMA | View on Reddit | 11 comments
Wife running our local llama, a bit slow because it's too large (the llama not my wife)

Posted by Osama_Saba@reddit | LocalLLaMA | View on Reddit | 74 comments
GPT-4o mini vs models

Posted by Osama_Saba@reddit | LocalLLaMA | View on Reddit | 8 comments
LM Studio makes the computer slow for no reason

Posted by Osama_Saba@reddit | LocalLLaMA | View on Reddit | 2 comments
I don't want thinking models

Posted by Osama_Saba@reddit | LocalLLaMA | View on Reddit | 15 comments
I'm hungry for tool use

Posted by Osama_Saba@reddit | LocalLLaMA | View on Reddit | 12 comments
I'm hungry for tool use

Posted by Osama_Saba@reddit | LocalLLaMA | View on Reddit | 1 comment
I want to know the reason for the go around but I can't understand what LOT 454 is saying

Posted by Osama_Saba@reddit | aviation | View on Reddit | 19 comments