OcelotOk8071
Introducing Gemma 4 12B: a unified, encoder-free multimodal model
Posted by johnnyApplePRNG@reddit | LocalLLaMA | View on Reddit | 93 comments
Inferencing at 10.33 t/s on Qwen 3.5 35B on a $300 laptop
Posted by OcelotOk8071@reddit | LocalLLaMA | View on Reddit | 20 comments
OcelotOk8071@reddit (OP)
Google AI Edge Gallery v1.0.13 & v1.0.14 updates: Gemma 4 Multi-Token Prediction, Pixel TPU support, experimental MCP, new skills, now saves chat history
Posted by AnticitizenPrime@reddit | LocalLLaMA | View on Reddit | 38 comments
OcelotOk8071@reddit
PSA: If you haven’t updated Llama.cpp for a couple of days and find MTP to not be performing well, update llamacpp.
Posted by Borkato@reddit | LocalLLaMA | View on Reddit | 34 comments
OcelotOk8071@reddit
I hope that someday we will have a 124B Gemma.
Posted by cgs019283@reddit | LocalLLaMA | View on Reddit | 77 comments
OcelotOk8071@reddit
[Paper on Hummingbird+: low-cost FPGAs for LLM inference] Qwen3-30B-A3B Q4 at 18 t/s token-gen, 24GB, expected $150 mass production cost
Posted by ayake_ayake@reddit | LocalLLaMA | View on Reddit | 56 comments
OcelotOk8071@reddit
Kimi K2.6 Released (huggingface)
Posted by BiggestBau5@reddit | LocalLLaMA | View on Reddit | 277 comments
OcelotOk8071@reddit
GPT Image 2 finally killed the 'yellow filter'—everyday Chinese scenes are usable now
Posted by TroyHarry6677@reddit | LocalLLaMA | View on Reddit | 5 comments
OcelotOk8071@reddit
I made the "Mafia" Party game with leading LLMs to see how good these SOTA models are at social deduction and manipulating
Posted by Cyrax21_@reddit | LocalLLaMA | View on Reddit | 4 comments
OcelotOk8071@reddit
How are you handling output inconsistency in local LLM setups?
Posted by nipundwivedi@reddit | LocalLLaMA | View on Reddit | 1 comment
OcelotOk8071@reddit
Released Qwen3.6-35B-A3B
Posted by NewEconomy55@reddit | LocalLLaMA | View on Reddit | 93 comments
OcelotOk8071@reddit
Me right now
Posted by -dysangel-@reddit | LocalLLaMA | View on Reddit | 10 comments
OcelotOk8071@reddit
What it took to launch Google DeepMind's Gemma 4
Posted by jacek2023@reddit | LocalLLaMA | View on Reddit | 136 comments
OcelotOk8071@reddit
Are we currently in a "Golden Time" for low VRAM/1 GPU users with Qwen 27b?
Posted by inthesearchof@reddit | LocalLLaMA | View on Reddit | 117 comments
OcelotOk8071@reddit
Am I doing something wrong? Or is Qwen 3.5VL only capable of writing dialogue like it's trying to imitate some kind of medieval knight?
Posted by Parogarr@reddit | LocalLLaMA | View on Reddit | 24 comments
OcelotOk8071@reddit
llama.cpp build b8338 adds OpenVINO backend + NPU support for prefill + kvcache
Posted by stormy1one@reddit | LocalLLaMA | View on Reddit | 13 comments
OcelotOk8071@reddit
Qwen 3 32B on M2 Max 32GB — my honest 3-week assessment
Posted by Budulai343@reddit | LocalLLaMA | View on Reddit | 16 comments
OcelotOk8071@reddit
WTF? Was Qwen3.5 9B trained with Google?
Posted by powerade-trader@reddit | LocalLLaMA | View on Reddit | 11 comments
OcelotOk8071@reddit
Qwen3.5 2B giving weird answers
Posted by Dean_Thomas426@reddit | LocalLLaMA | View on Reddit | 10 comments
OcelotOk8071@reddit
70B llm on 4gb android phone !
Posted by Vast_Lingonberry7259@reddit | LocalLLaMA | View on Reddit | 25 comments
OcelotOk8071@reddit
Any hope for Gemma 4 release?
Posted by gamblingapocalypse@reddit | LocalLLaMA | View on Reddit | 38 comments
OcelotOk8071@reddit
Speculation: new Gemma, Granite, Arcee Trinity models when?
Posted by RobotRobotWhatDoUSee@reddit | LocalLLaMA | View on Reddit | 7 comments
OcelotOk8071@reddit
4x4090 build running gpt-oss:20b locally - full specs
Posted by RentEquivalent1671@reddit | LocalLLaMA | View on Reddit | 96 comments
OcelotOk8071@reddit
My LLM trained from scratch on only 1800s London texts brings up a real protest from 1834
Posted by Remarkable-Trick-177@reddit | LocalLLaMA | View on Reddit | 174 comments
OcelotOk8071@reddit
My rice emergency supply has been destroyed by bugs.
Posted by WoodgladeRiver@reddit | preppers | View on Reddit | 172 comments
OcelotOk8071@reddit
OpenAI teases to open-source model(s) soon
Posted by ResearchCrafty1804@reddit | LocalLLaMA | View on Reddit | 113 comments
OcelotOk8071@reddit
Do any of you have a "hidden gem" LLM that you use daily?
Posted by ForsookComparison@reddit | LocalLLaMA | View on Reddit | 53 comments
OcelotOk8071@reddit
How are people using models smaller than 5b parameters?
Posted by Vegetable_Sun_9225@reddit | LocalLLaMA | View on Reddit | 130 comments
OcelotOk8071@reddit
Copyright protection method that has the "key" on both the distribution service server, and the actual copyright holder themselves, and only requires verification every 7 days or 30 days
Posted by dickcheney600@reddit | CrazyIdeas | View on Reddit | 19 comments
OcelotOk8071@reddit
Deepseek bitnet
Posted by Thistleknot@reddit | LocalLLaMA | View on Reddit | 52 comments
OcelotOk8071@reddit
LLMs like ChatGPT should nerf the output when people are being rude to it.
Posted by pastafarian24@reddit | CrazyIdeas | View on Reddit | 43 comments
OcelotOk8071@reddit
Who will release a new model in 2025 firstly?
Posted by foldl-li@reddit | LocalLLaMA | View on Reddit | 44 comments
OcelotOk8071@reddit
PoSe
Posted by Sufficient-Smile-481@reddit | LocalLLaMA | View on Reddit | 3 comments
OcelotOk8071@reddit
Restaurant called "The Birds Nest" where you order your food, the waiter eats it, then regurgitates it into your mouth.
Posted by n_thomas74@reddit | CrazyIdeas | View on Reddit | 16 comments
OcelotOk8071@reddit
Meta's Byte Latent Transformer (BLT) paper looks like the real-deal. Outperforming tokenization models even up to their tested 8B param model size. 2025 may be the year we say goodbye to tokenization.
Posted by jd_3d@reddit | LocalLLaMA | View on Reddit | 190 comments