google/gemma-4-12B · Hugging Face
Posted by jacek2023@reddit | LocalLLaMA | View on Reddit | 95 comments
Gemma is a family of open models built by Google DeepMind. Gemma 4 models are multimodal, handling text and image input (with audio supported on E2B, E4B, and 12B) and generating text output. This release includes open-weights models in both pre-trained and instruction-tuned variants. Gemma 4 features a context window of up to 256K tokens and maintains multilingual support in over 140 languages.
Featuring both Dense and Mixture-of-Experts (MoE) architectures, Gemma 4 is well-suited for tasks like text generation, coding, and reasoning. The models are available in five distinct sizes: **E2B**, **E4B**, **12B**, **26B A4B**, and **31B**. Their diverse sizes make them deployable in environments ranging from high-end phones to laptops and servers, democratizing access to state-of-the-art AI.
Gemma 4 introduces key **capability and architectural advancements**:
* **Reasoning** – All models in the family are designed as highly capable reasoners, with configurable thinking modes.
* **Extended Multimodalities** – Processes Text, Image with variable aspect ratio and resolution support (all models), Video, and Audio (featured natively on the E2B, E4B, and 12B models).
* **Diverse & Efficient Architectures** – Offers Dense and Mixture-of-Experts (MoE) variants of different sizes for scalable deployment.
* **Optimized for On-Device** – Smaller models are specifically designed for efficient local execution on laptops and mobile devices.
* **Increased Context Window** – The small models feature a 128K context window, while the medium models support 256K.
* **Enhanced Coding & Agentic Capabilities** – Achieves notable improvements in coding benchmarks alongside native function-calling support, powering highly capable autonomous agents.
* **Native System Prompt Support** – Gemma 4 introduces native support for the `system` role, enabling more structured and controllable conversations.
# [](https://huggingface.co/google/gemma-4-12B-it-assistant#models-overview)Models Overview
Gemma 4 models are designed to deliver frontier-level performance at each size, targeting deployment scenarios from mobile and edge devices (E2B, E4B) to consumer GPUs and workstations (12B, 26B A4B, 31B). They are well-suited for reasoning, agentic workflows, coding, and multimodal understanding.
The models employ a hybrid attention mechanism that interleaves local sliding window attention with full global attention, ensuring the final layer is always global. This hybrid design delivers the processing speed and low memory footprint of a lightweight model without sacrificing the deep awareness required for complex, long-context tasks. To optimize memory for long contexts, global layers feature unified Keys and Values, and apply Proportional RoPE (p-RoPE).
95 Comments
jacek2023@reddit (OP)
pmttyji@reddit
nixuelkty@reddit
pmttyji@reddit
krzyk@reddit
pmttyji@reddit
jacek2023@reddit (OP)
pmttyji@reddit
Danmoreng@reddit
StaysAwakeAllWeek@reddit
MaruluVR@reddit
AppealThink1733@reddit
silenceimpaired@reddit
Melbar666@reddit
ZdzisiuFryta@reddit
No_Lingonberry1201@reddit
jacek2023@reddit (OP)
FormerPassenger1558@reddit
arbv@reddit
grudev@reddit
Few_Painter_5588@reddit
jacek2023@reddit (OP)
arbv@reddit
M4GMaR@reddit
uhuge@reddit
Adventurous-Paper566@reddit
windows_error23@reddit
unknowntoman-1@reddit
annodomini@reddit
jacek2023@reddit (OP)
BoogerheadCult@reddit
seamonn@reddit
BoogerheadCult@reddit
seamonn@reddit
deathacus12@reddit
annodomini@reddit
WithoutReason1729@reddit
seamonn@reddit
Clean_Hyena7172@reddit
jld1532@reddit
uhuge@reddit
Clean_Hyena7172@reddit
jacek2023@reddit (OP)
seamonn@reddit
arbv@reddit
Toastti@reddit
Clean_Hyena7172@reddit
jacek2023@reddit (OP)
srivatsasrinivasmath@reddit
bonobomaster@reddit
nullbyte420@reddit
DedsPhil@reddit
jacek2023@reddit (OP)
stddealer@reddit
hackerllama@reddit
dampflokfreund@reddit
Opening-Broccoli9190@reddit
Tyrannas@reddit
jacek2023@reddit (OP)
Hoak-em@reddit
siegevjorn@reddit
Final-Rush759@reddit
MaartenGr@reddit
seamonn@reddit
Valuable_Touch5670@reddit
bonobomaster@reddit
error_museum@reddit
nickless07@reddit
error_museum@reddit
MarkoMarjamaa@reddit
nickless07@reddit
Guilty_Rooster_6708@reddit
Hydroskeletal@reddit
mechasquare@reddit
arbv@reddit
ea_man@reddit
EcstaticDentist@reddit
Valuable_Touch5670@reddit
M4GMaR@reddit
false79@reddit
annodomini@reddit
Hanthunius@reddit
HornyGooner4402@reddit
Temporary-Roof2867@reddit
BitGreen1270@reddit
ttkciar@reddit
false79@reddit
alex20_202020@reddit
jacek2023@reddit (OP)
larrytheevilbunnie@reddit
Jealous-Astronaut457@reddit
jacek2023@reddit (OP)
Eyelbee@reddit
seamonn@reddit
jacek2023@reddit (OP)