ninjasaid13

gemma-4-12b-it vs Qwen3.5-9B on shared benchmarks: Qwen is overall winner beating gemma in 5/8 benchmarks despite a smaller footprint

Posted by fulgencio_batista@reddit | LocalLLaMA | View on Reddit | 151 comments

Qwen cant wait to release 3.7 models

Posted by GotHereLateNameTaken@reddit | LocalLLaMA | View on Reddit | 276 comments

I Let a Small Model Train on Its Own Mistakes. It Reached 80% on HumanEval and Beat GPT-3.5 on Math

Posted by QuantumSeeds@reddit | LocalLLaMA | View on Reddit | 59 comments

<thinking></thinking>

Posted by Comfortable-Rock-498@reddit | LocalLLaMA | View on Reddit | 89 comments

Gemma 4 MTP released

Posted by rerri@reddit | LocalLLaMA | View on Reddit | 301 comments

<thinking></thinking>

Posted by Comfortable-Rock-498@reddit | LocalLLaMA | View on Reddit | 89 comments

ninjasaid13@reddit

>Gemini 3.1 Pro, Claude 4.7 Opus have the capacity to reason at a PhD level of a given field given the person doing the prompts is also highly skilled in the field to be able to give clear instructions and maybe provide grounding sources.

They know everything but understand nothing.

<thinking></thinking>

Posted by Comfortable-Rock-498@reddit | LocalLLaMA | View on Reddit | 89 comments

ninjasaid13@reddit

>Aren't LLMs already statistically smarter than a majority of humans?

At answering new questions, they are knowledgeable, at creating new questions, no.

Best Local LLMs - Apr 2026

Posted by rm-rf-rm@reddit | LocalLLaMA | View on Reddit | 365 comments

Embracing the noise: How to build an agent that is both neuro-symbolic and probabilistic.

Posted by DepthOk4115@reddit | LocalLLaMA | View on Reddit | 10 comments

Decreased Intelligence Density in DeepSeek V4 Pro

Posted by Mindless_Pain1860@reddit | LocalLLaMA | View on Reddit | 90 comments

r/LocalLLaMa Rule Updates

Posted by rm-rf-rm@reddit | LocalLLaMA | View on Reddit | 121 comments

I made a tiny world model game that runs locally on iPad

Posted by howthefrondsfold@reddit | LocalLLaMA | View on Reddit | 27 comments

Qwen-Image-2.0 is out - 7B unified gen+edit model with native 2K and actual text rendering

Posted by RIPT1D3_Z@reddit | LocalLLaMA | View on Reddit | 120 comments

Meta to open source versions of its next AI models

Posted by abkibaarnsit@reddit | LocalLLaMA | View on Reddit | 62 comments

Mistral AI to release Voxtral TTS, a 3-billion-parameter text-to-speech model with open weights that the company says outperformed ElevenLabs Flash v2.5 in human preference tests. The model runs on about 3 GB of RAM, achieves 90-millisecond time-to-first-audio, supports nine languages.

Posted by Nunki08@reddit | LocalLLaMA | View on Reddit | 186 comments

Introducing ARC-AGI-3

Posted by Complete-Sea6655@reddit | LocalLLaMA | View on Reddit | 100 comments

ninjasaid13@reddit

>What I find interesting about AGI-3 is that it shifts the evaluation unit from 'can it solve this task' to 'how efficiently does it acquire the skill.' That's a much harder thing to fake. You can brute force a benchmark. You can't brute force a learning curve.

Exactly. I've had this idea for a benchmark for a while.

High school student seeking advice: Found an architectural breakthrough that scales a 17.6B model down to 417M?

Posted by Appropriate-Scar3116@reddit | LocalLLaMA | View on Reddit | 210 comments

Qwen3.5B VS the SOTA same size models from 2 years ago.

Posted by Uncle___Marty@reddit | LocalLLaMA | View on Reddit | 59 comments

Qwen 2.5 -> 3 -> 3.5, smallest models. Incredible improvement over the generations.

Posted by airbus_a360_when@reddit | LocalLLaMA | View on Reddit | 136 comments

Qwen3.5-397B-A17B-UD-TQ1 bench results FW Desktop Strix Halo 128GB

Posted by dabiggmoe2@reddit | LocalLLaMA | View on Reddit | 58 comments

ninjasaid13@reddit

>Qwen3.5-397B-A17B-UD-TQ1

https://preview.redd.it/2vjkv17ufglg1.png?width=1633&format=png&auto=webp&s=ba23ec946d34a5ea70be82399adc606d4c872ab8

meanwhile in China

Posted by Tiny_Judge_2119@reddit | LocalLLaMA | View on Reddit | 33 comments

ninjasaid13@reddit

>Good, and a new version which should match or exceed Seedance 2.0's capabilities should release within a month or two.

If it were within a month or two of release, it would have been demoed.

How I mapped every High Court of Australia case and their citations (1901-2025)

Posted by Neon0asis@reddit | LocalLLaMA | View on Reddit | 6 comments

Pack it up guys, open weight AI models running offline locally on PCs aren't real. 😞

Posted by CesarOverlorde@reddit | LocalLLaMA | View on Reddit | 294 comments

Anthropic is deploying 20M$ to support AI regulation in sight of 2026 elections

Posted by 1998marcom@reddit | LocalLLaMA | View on Reddit | 81 comments

ninjasaid13@reddit

An LLM isn't walking a non-technical user through anything if they don't have the basic underlying technical knowledge and wouldn't even know if the AI was "hallucinating" a dangerous or impossible step in a protocol. I don't think this has changed at all in several years. Research consistently shows that AI is an assistive tool; it doesn't grant tacit knowledge to the end user, even as the AI starts to become more knowledgeable on virology than experts. The internet provides a what, an LLM might provide a better what, but neither provides a how.

I'm playing telephone pictionary with LLMs, VLMs, SDs, and Kokoro on my Strix Halo

Posted by jfowers_amd@reddit | LocalLLaMA | View on Reddit | 9 comments

Unsloth just unleashed Glm 5! GGUF NOW!

Posted by RickyRickC137@reddit | LocalLLaMA | View on Reddit | 82 comments

New stealth model: Pony Alpha

Posted by sirjoaco@reddit | LocalLLaMA | View on Reddit | 30 comments

We built an 8B world model that beats 402B Llama 4 by generating web code instead of pixels — open weights on HF

Posted by jshin49@reddit | LocalLLaMA | View on Reddit | 46 comments

Fei Fei Li dropped a non-JEPA world model, and the spatial intelligence is insane

Posted by coloradical5280@reddit | LocalLLaMA | View on Reddit | 90 comments

ninjasaid13@reddit

>And the point is that this does NOT uses triangles or vertices.

Nor does my paper

>Papers aren't products

So the only difference is that it's a product? Is being a product supposed to be what makes it revolutionary?

GLM-Image is released!

Posted by foldl-li@reddit | LocalLLaMA | View on Reddit | 82 comments

The current state of sparse-MoE's for agentic coding work (Opinion)

Posted by ForsookComparison@reddit | LocalLLaMA | View on Reddit | 80 comments

Apple introduces SHARP, a model that generates a photorealistic 3D Gaussian representation from a single image in seconds.

Posted by themixtergames@reddit | LocalLLaMA | View on Reddit | 140 comments

Basketball AI with RF-DETR, SAM2, and SmolVLM2

Posted by RandomForests92@reddit | LocalLLaMA | View on Reddit | 48 comments

WTF! Is this real? Teenagers are building AGI Research Lab

Posted by Illustrious-Yak-9195@reddit | LocalLLaMA | View on Reddit | 16 comments

When you figure out it’s all just math:

Posted by Current-Ticket4214@reddit | LocalLLaMA | View on Reddit | 381 comments

ninjasaid13@reddit

Your brain uses electric charge and a calculator uses electric charge; does that mean that your brain is not in contradiction to a calculator?

>And we do know what is AGI, and its criterias

We do not have any besides defining it in terms of human intelligence.

> Ok tell me if it is not based predictive processing and attention processing?

This doesn't mean LLMs have human-like thinking. The LLM predicts the most probable next token by learning the statistics of language, while human thinking is not based on tokens or language at all. What humans predict is the state of the world.

If anyone tells you that LLMs have a world model or that video generators have a world model, they're sorely mistaken about what a world model is. This is what a real world model requires: https://en.wikipedia.org/wiki/Schema_(psychology)

Microsoft’s AI Scientist

Posted by Ok-Breakfast-4676@reddit | LocalLLaMA | View on Reddit | 36 comments