Osama_Saba
-
Q3 is absolute garbage, but we always use Q4. Is it good?
Posted by Osama_Saba@reddit | LocalLLaMA | View on Reddit | 24 comments
-
People who don't enable flash attention - what's your problem?
Posted by Osama_Saba@reddit | LocalLLaMA | View on Reddit | 46 comments
-
What do we use for real-time English speech recognition with low VRAM?
Posted by Osama_Saba@reddit | LocalLLaMA | View on Reddit | 16 comments
-
Is vLLM faster than Ollama?
Posted by Osama_Saba@reddit | LocalLLaMA | View on Reddit | 10 comments
-
Why is there no more progress in multimodal models under 10B? It's too slow. I need something new or I'll sell my GPU. Not really joking, but why?
Posted by Osama_Saba@reddit | LocalLLaMA | View on Reddit | 11 comments
-
Qwen 14B is better than me...
Posted by Osama_Saba@reddit | LocalLLaMA | View on Reddit | 362 comments
-
Wait, can I abuse huggingface for storage for free??
Posted by Osama_Saba@reddit | LocalLLaMA | View on Reddit | 13 comments
-
Cached input locally?
Posted by Osama_Saba@reddit | LocalLLaMA | View on Reddit | 11 comments
-
Wife running our local llama, a bit slow because it's too large (the llama not my wife)
Posted by Osama_Saba@reddit | LocalLLaMA | View on Reddit | 74 comments
-
GPT-4o-mini vs models
Posted by Osama_Saba@reddit | LocalLLaMA | View on Reddit | 8 comments
-
LM Studio makes the computer slow for no reason
Posted by Osama_Saba@reddit | LocalLLaMA | View on Reddit | 2 comments
-
I don't want thinking models
Posted by Osama_Saba@reddit | LocalLLaMA | View on Reddit | 15 comments
-
I'm hungry for tool use
Posted by Osama_Saba@reddit | LocalLLaMA | View on Reddit | 12 comments
-
I'm hungry for tool use
Posted by Osama_Saba@reddit | LocalLLaMA | View on Reddit | 1 comment
-
I want to know the reason for the go around but I can't understand what LOT 454 is saying
Posted by Osama_Saba@reddit | aviation | View on Reddit | 19 comments