inclusionAI/Ling-2.6-1T · Hugging Face
Posted by pmttyji@reddit | LocalLLaMA | 15 comments
Ling-2.6-1T: A Trillion-Parameter Comprehensive Flagship Model for Complex Tasks
Today, we are thrilled to open-source Ling-2.6-1T from the Ling family.
Tailored for real-world, complex scenarios, this trillion-parameter model introduces targeted optimizations across inference efficiency, token overhead, and agentic capabilities, making it highly effective for coding and daily workflows.
Key upgrades in Ling-2.6-1T include:
- High Inference Efficiency: By adopting a hybrid architecture combining MLA and Linear Attention, we dramatically reduce latency and VRAM footprint for long contexts. It delivers superior throughput and lower per-token computational costs without sacrificing expressivity, ensuring real-time responsiveness for complex reasoning and tool calling.
- Lower Token Overhead via "Fast Thinking": We introduce a Contextual Process Redundancy Suppression reward strategy during post-training. This reduces reliance on verbose chains-of-thought (CoT), using a "fast thinking" mechanism to reach answers directly and compress output costs while maintaining top-tier intelligence.
- Reliable Multi-Step Execution: With enhanced reasoning, agentic coding, and instruction following, Ling-2.6-1T achieves open-source SOTA on execution-heavy benchmarks, including AIME26, SWE-bench Verified, BFCL-V4, TAU2-Bench, and IFBench.
- Production-Ready for Agent Workflows: Designed for end-to-end engineering, from code generation to bug fixing, Ling-2.6-1T integrates seamlessly with mainstream agent frameworks like Claude Code, OpenClaw, OpenCode, and CodeBuddy, effortlessly handling multi-tool, multi-step constraints in enterprise environments.
unbannedfornothing@reddit
Damn, do they know any other numbers than 1 trillion?
pmttyji@reddit (OP)
A day ago, they released a 100B-sized model
https://huggingface.co/inclusionAI/Ling-2.6-flash
Last year, they released a 17B-sized model called Ling-Mini. Unfortunately they skipped it in the last version, and I think this time too
unbannedfornothing@reddit
I meant globally: Kimi, Mimo, DeepSeek, and now (again) Ling.
nullmove@reddit
DeepSeek is like 60% bigger than 1T
Tall-Ad-7742@reddit
no?
nullmove@reddit
Have you ever heard of quantisation? This model is natively in mixed precision: MoE experts are in FP4 and the rest in FP8. Hugging Face probably does a naive size-based total calculation, hence it's broken.
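The arithmetic behind this point: a mixed-precision checkpoint occupies far fewer bytes per parameter than a uniform one, so any byte-count-based estimate undershoots. A back-of-envelope sketch (the 95% expert fraction is an illustrative assumption, not a published figure; FP4 is 0.5 bytes/param, FP8 is 1 byte/param):

```python
def disk_bytes(total_params: float, expert_frac: float) -> float:
    """On-disk size when MoE expert weights are FP4 and the rest FP8."""
    expert = total_params * expert_frac * 0.5         # FP4: 0.5 bytes/param
    dense = total_params * (1.0 - expert_frac) * 1.0  # FP8: 1 byte/param
    return expert + dense

true_params = 1.6e12  # 1.6T parameters, as stated in the model README
size = disk_bytes(true_params, expert_frac=0.95)  # expert_frac is a guess

# A naive "1 byte per parameter" reading of the disk size would conclude:
naive_params = size / 1.0
print(f"on-disk size: {size / 1e12:.2f} TB")                      # 0.84 TB
print(f"naive estimate: {naive_params / 1e12:.2f}T vs true 1.6T")  # 0.84T
```

So a 1.6T-parameter model can sit in well under 1 TB of weights, which is roughly the kind of discrepancy a size-derived listing would show.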
If you could be arsed to literally click on that link and eyeball the README (very difficult apparently), you could clearly see it spelled out that it's a 1.6T param model. But what do DeepSeek know about their own model, eh?
Tall-Ad-7742@reddit
Oh, if that's a mistake on Hugging Face then sorry. I thought they did the same thing as Kimi, where I think they quantized the model from the beginning
pmttyji@reddit (OP)
Agree. At least Mimo, DeepSeek, and Ling released additional models apart from the 1T ones 😃
I really want to see more 100-200B size models.
nuclearbananana@reddit
Might be the best non-thinking model, not the best overall
pmttyji@reddit (OP)
They'll be releasing thinking ones (called Ring-2.6-1T & Ring-2.6-flash) sooner or later
Tall-Ad-7742@reddit
oh I really hope so
KickLassChewGum@reddit
This... is not a great model. I've had it throw together a quick & simple HFTransformers-based inference script and it completely bungled it, hallucinated a bunch of non-existent config flags, wrote 250 lines of dead code, and added a comment that it was "tested & working."
Gemma 4 31B wrote 40 lines and nailed it.
Inside-Chance-320@reddit
The benchmarks compare against old models: GLM-5, DeepSeek 3.2, Kimi 2.5, and so on.
Hodler-mane@reddit
its fockin raining models!
LatentSpacer@reddit
Hallelujah!