tomz17
Llama.cpp
Posted by Pancake502@reddit | LocalLLaMA | View on Reddit | 22 comments
MiniMax M3 - Coding & Agentic Frontier, 1M Context, Multimodal
Posted by dryadofelysium@reddit | LocalLLaMA | View on Reddit | 220 comments
tomz17@reddit
Someone hid a full RAT inside a fake npm package and exfiltrated victim data to HuggingFace
Posted by BattleRemote3157@reddit | programming | View on Reddit | 101 comments
tomz17@reddit
Qwen3.5 27B running at ~65tps with DFlash speculation on 2x 3090
Posted by Kryesh@reddit | LocalLLaMA | View on Reddit | 21 comments
tomz17@reddit
Llama.cpp MTP with Qwen3.6 27B on Headless RTX 3090
Posted by cleversmoke@reddit | LocalLLaMA | View on Reddit | 44 comments
tomz17@reddit
That's a good news...
Posted by Pjotrs@reddit | LocalLLaMA | View on Reddit | 244 comments
tomz17@reddit
how would you set up a local llm server for a business of 7 people?
Posted by snowieslilpikachu69@reddit | LocalLLaMA | View on Reddit | 58 comments
tomz17@reddit
Multi-Token Prediction (MTP) for Qwen on LLaMA.cpp + TurboQuant
Posted by gladkos@reddit | LocalLLaMA | View on Reddit | 83 comments
tomz17@reddit
Web-Search is coming to a screeching performance halt as Google shuts down their free search index, and traffic defenders like Cloudflare challenge AI at every gateway. What are our options?
Posted by NetTechMan@reddit | LocalLLaMA | View on Reddit | 238 comments
tomz17@reddit
What do I do if I accidentally put regular gas in a premium car?
Posted by Street_Firefighter_7@reddit | askcarguys | View on Reddit | 303 comments
tomz17@reddit
[Paper on Hummingbird+: low-cost FPGAs for LLM inference] Qwen3-30B-A3B Q4 at 18 t/s token-gen, 24GB, expected $150 mass production cost
Posted by ayake_ayake@reddit | LocalLLaMA | View on Reddit | 56 comments
tomz17@reddit
Follow-up: Qwen3.6-27B on 1× RTX 3090 — pushing to ~218K context + ~50–66 TPS, tool calls now stable (PN12 fix)
Posted by AmazingDrivers4u@reddit | LocalLLaMA | View on Reddit | 66 comments
tomz17@reddit
Deepseek v4 pricing is genuinely silly, did the math and now i am questioning my entire stack
Posted by Skid_gates_99@reddit | LocalLLaMA | View on Reddit | 77 comments
tomz17@reddit
Opus 4.7 Max subscriber. Switching to Kimi 2.6
Posted by meaningego@reddit | LocalLLaMA | View on Reddit | 106 comments
tomz17@reddit
Those of you running minimax 2.7 locally, how are you feeling about it?
Posted by laterbreh@reddit | LocalLLaMA | View on Reddit | 129 comments
tomz17@reddit
MiniMax-M2.7's MIT-Style License Is a Misleading Restriction That Bans Commercial Use and Fails Free Software Standards
Posted by pmttyji@reddit | LocalLLaMA | View on Reddit | 34 comments
tomz17@reddit
MiniMax-M2.7 NVFP4 on 2x RTX PRO 6000 Blackwell — bench numbers
Posted by Visual_Synthesizer@reddit | LocalLLaMA | View on Reddit | 19 comments
tomz17@reddit
MiniMax-M2.7 GGUF Quants — Full Set (Q2_K to Q8_0 + BF16)
Posted by Asleep_Training3543@reddit | LocalLLaMA | View on Reddit | 21 comments
tomz17@reddit
MiniMax M2.7 is NOT open source - DOA License :(
Posted by KvAk_AKPlaysYT@reddit | LocalLLaMA | View on Reddit | 229 comments
tomz17@reddit
DFlash: Block Diffusion for Flash Speculative Decoding.
Posted by Total-Resort-3120@reddit | LocalLLaMA | View on Reddit | 127 comments
tomz17@reddit
What it took to launch Google DeepMind's Gemma 4
Posted by jacek2023@reddit | LocalLLaMA | View on Reddit | 136 comments
tomz17@reddit
Claude Code's source just leaked — I extracted its multi-agent orchestration system into an open-source framework that works with any LLM
Posted by JackChen02@reddit | LocalLLaMA | View on Reddit | 317 comments
tomz17@reddit
local llm inference on M4 Max vs M5 Max
Posted by purealgo@reddit | LocalLLaMA | View on Reddit | 2 comments
tomz17@reddit
Kimi K2.6 will drop in the next 2 weeks, K3 is WIP and will be huge
Posted by No-Thought-4995@reddit | LocalLLaMA | View on Reddit | 68 comments
tomz17@reddit
vLLM First timer 3090 + 3090Ti with Qwen 3.5 27b Q4
Posted by edankwan@reddit | LocalLLaMA | View on Reddit | 9 comments
tomz17@reddit
dGPU gang we're so back
Posted by ForsookComparison@reddit | LocalLLaMA | View on Reddit | 41 comments
tomz17@reddit
I understand the disappointment if minimax 2.7 does not become open weights but we have had a lot..
Posted by LegacyRemaster@reddit | LocalLLaMA | View on Reddit | 28 comments
tomz17@reddit
This is incredibly tempting
Posted by No_Mango7658@reddit | LocalLLaMA | View on Reddit | 115 comments
tomz17@reddit
Glm 5.1 👀
Posted by Namra_7@reddit | LocalLLaMA | View on Reddit | 99 comments
tomz17@reddit
CLI coding client - alternative to (not so) OpenCode
Posted by momsi91@reddit | LocalLLaMA | View on Reddit | 22 comments
tomz17@reddit
DeepSeek just called itself Claude mid-convo… what?? 💀
Posted by Annual_Point7199@reddit | LocalLLaMA | View on Reddit | 9 comments
tomz17@reddit
Nvidia P4000, i need some help
Posted by prxy15@reddit | LocalLLaMA | View on Reddit | 11 comments
tomz17@reddit
How’d I do?
Posted by No_Development5871@reddit | LocalLLaMA | View on Reddit | 5 comments
tomz17@reddit
Dual 3090s (power-limited) - Are 3x PCI-E cables w/daisy-chain "okay?"
Posted by overand@reddit | LocalLLaMA | View on Reddit | 17 comments
tomz17@reddit
Chonkers and thermals (dual 3090)
Posted by BetStack@reddit | LocalLLaMA | View on Reddit | 16 comments