JSVD2
BeeLlama v0.3.1 – latest llama.cpp with extras! DFlash, MTP, q6_0 cache, TurboQuant. Single RTX 3090: Qwen 3.6 27B & Gemma 4 31B up to 177.8 tps (4.93x over baseline)
Posted by Anbeeld@reddit | LocalLLaMA | View on Reddit | 10 comments
Nemotron 3 Ultra reality check: no one-box 128GB GGUF route yet; Nemotron 3 Nano runs at 66.6 t/s on Strix Halo
Posted by JSVD2@reddit | LocalLLaMA | View on Reddit | 34 comments
JSVD2@reddit (OP)
NVIDIA Nemotron 3 Ultra is out.
Posted by justdoitanddont@reddit | LocalLLaMA | View on Reddit | 1 comment
JSVD2@reddit
nvidia/NVIDIA-Nemotron-3-Ultra-550B-A55B-BF16 · Hugging Face
Posted by jacek2023@reddit | LocalLLaMA | View on Reddit | 138 comments
JSVD2@reddit
llama.cpp - Qwen3.6/3.5-MTP - Share your benchmarks t/s
Posted by pmttyji@reddit | LocalLLaMA | View on Reddit | 40 comments
JSVD2@reddit
gemma-4-12b-it vs Qwen3.5-9B on shared benchmarks: Qwen is overall winner beating gemma in 5/8 benchmarks despite a smaller footprint
Posted by fulgencio_batista@reddit | LocalLLaMA | View on Reddit | 146 comments
JSVD2@reddit
Direct 100.0 t/s on Strix Halo with Qwen3 30B-A3B. Can anyone reproduce or beat this?
Posted by JSVD2@reddit | LocalLLaMA | View on Reddit | 23 comments
JSVD2@reddit (OP)
I found what I was looking for in Qwen 3.7.
Posted by CosmicRiver827@reddit | LocalLLaMA | View on Reddit | 9 comments
JSVD2@reddit
Strix Halo 128Gb: what models, which quants are optimal?
Posted by DevelopmentBorn3978@reddit | LocalLLaMA | View on Reddit | 50 comments
JSVD2@reddit
Stop traumatizing AI into loops and turn hallucinations into an honest "I don't know!" by being NICE to them (Proof of Concept, Research, I don't want to sell anything)
Posted by OttoRenner@reddit | LocalLLaMA | View on Reddit | 349 comments
JSVD2@reddit
Shoutout to Gemma4 as a conversational assistant / agent
Posted by goldcakes@reddit | LocalLLaMA | View on Reddit | 66 comments
JSVD2@reddit
1-bit Bonsai Image 4B and Ternary Bonsai Image 4B Image Generation for Local Devices with just 0.93 GB and 1.21 GB respectively of Diffusion Transformer Footprint. So tiny!
Posted by Addyad@reddit | LocalLLaMA | View on Reddit | 16 comments
JSVD2@reddit
I have become George Jetson: my job is now Yes/No supervision for a machine I don’t fully understand.
Posted by Helpful_Today7449@reddit | LocalLLaMA | View on Reddit | 72 comments
JSVD2@reddit
what do you use your local llm?
Posted by FormalAd7367@reddit | LocalLLaMA | View on Reddit | 39 comments
JSVD2@reddit
Qwen3.6 35B-A3B successfully completed the FoodTruck Bench!
Posted by PulseVector@reddit | LocalLLaMA | View on Reddit | 20 comments
JSVD2@reddit
Putting together a pc. Are my assumptions correct?
Posted by Competitive_Wait_267@reddit | LocalLLaMA | View on Reddit | 17 comments
JSVD2@reddit
DIY Local 2x DGX Spark cluster cooler with automatic temperature controlled fan.
Posted by Porespellar@reddit | LocalLLaMA | View on Reddit | 6 comments
JSVD2@reddit
My home data center
Posted by alecKarfonta@reddit | LocalLLaMA | View on Reddit | 86 comments
JSVD2@reddit
125 tok/s for Qwen3.6 q4xl on 2x 4060ti is insane perf/dollar
Posted by Chuyito@reddit | LocalLLaMA | View on Reddit | 100 comments
JSVD2@reddit
What memory system are you using for your agents?
Posted by Mr_Moonsilver@reddit | LocalLLaMA | View on Reddit | 60 comments
JSVD2@reddit
My new home office radiator 🥵
Posted by lantern_lol@reddit | LocalLLaMA | View on Reddit | 73 comments
JSVD2@reddit
DolphinGemma release when?
Posted by Environmental-Metal9@reddit | LocalLLaMA | View on Reddit | 12 comments
JSVD2@reddit
How do you prove an open model actually improved?
Posted by tonyblu331@reddit | LocalLLaMA | View on Reddit | 22 comments
JSVD2@reddit
So qwen3.7-4b when?
Posted by ab2377@reddit | LocalLLaMA | View on Reddit | 47 comments
JSVD2@reddit