MoMoneyMoStudy
NVIDIA has 72GB VRAM version now
Posted by decentralize999@reddit | LocalLLaMA | View on Reddit | 154 comments
MoMoneyMoStudy@reddit
GPT-OSS-20B is in the sweet spot for building Agents
Posted by sunpazed@reddit | LocalLLaMA | View on Reddit | 99 comments
MoMoneyMoStudy@reddit
Wow anthropic and Google losing coding share bc of qwen 3 coder
Posted by Independent-Wind4462@reddit | LocalLLaMA | View on Reddit | 128 comments
MoMoneyMoStudy@reddit
OpenAI GPT-OSS-120b is an excellent model
Posted by xxPoLyGLoTxx@reddit | LocalLLaMA | View on Reddit | 149 comments
MoMoneyMoStudy@reddit
Bye bye, Meta AI, it was good while it lasted.
Posted by absolooot1@reddit | LocalLLaMA | View on Reddit | 432 comments
MoMoneyMoStudy@reddit
Google massively slashes Gemini Flash pricing in response to GPT-4o mini
Posted by Vivid_Dot_6405@reddit | LocalLLaMA | View on Reddit | 72 comments
MoMoneyMoStudy@reddit
Meta just pushed a new Llama 3.1 405B to HF
Posted by Accomplished_Ad9530@reddit | LocalLLaMA | View on Reddit | 52 comments
MoMoneyMoStudy@reddit
AMD hopes to unlock MI300’s full potential with fresh code
Posted by No_Training9444@reddit | LocalLLaMA | View on Reddit | 31 comments
MoMoneyMoStudy@reddit
Snapdragon X CPU inference is fast! (Q_4_0_4_8 quantization)
Posted by Some_Endian_FP17@reddit | LocalLLaMA | View on Reddit | 76 comments
MoMoneyMoStudy@reddit
Anyone else find Llama 4 models kinda underwhelming?
Posted by Conutu@reddit | LocalLLaMA | View on Reddit | 103 comments
MoMoneyMoStudy@reddit
Quantize 123B Mistral-Large-Instruct-2407 to 35 GB with only 4% accuracy degeneration.
Posted by RelationshipWeekly78@reddit | LocalLLaMA | View on Reddit | 116 comments
MoMoneyMoStudy@reddit
New medical and financial 70b 32k Writer models
Posted by mindwip@reddit | LocalLLaMA | View on Reddit | 100 comments
MoMoneyMoStudy@reddit
The software-pain of running local LLM finally got to me - so I made my own inferencing server that you don't need to compile or update anytime a new model/tokenizer drops; you don't need to quantize or even download your LLMs - just give it a name & run LLMs the moment they're posted on HuggingFace
Posted by AbheekG@reddit | LocalLLaMA | View on Reddit | 89 comments
MoMoneyMoStudy@reddit
Is the new DDR6 the era of CPU-powered LLMs?
Posted by AlexBefest@reddit | LocalLLaMA | View on Reddit | 146 comments
MoMoneyMoStudy@reddit
70b here I come!
Posted by Mr_Impossibro@reddit | LocalLLaMA | View on Reddit | 65 comments
MoMoneyMoStudy@reddit
Llama 3.1 405B EXL2 quant results
Posted by Grimulkan@reddit | LocalLLaMA | View on Reddit | 62 comments