Gemma 3 12B
by Google · gemma-3 family
Tags: text-generation, code-generation, reasoning, multilingual, vision, math, summarization
Gemma 3 12B is the sweet spot of the Gemma 3 family — multimodal, 128K context, and strong enough to compete with models twice its size. It's one of the most popular models on Ollama with tens of millions of pulls. At Q4, it fits comfortably on 12-16 GB GPUs and delivers excellent results for conversation, coding, reasoning, and image understanding. A strong all-rounder for anyone with mid-range hardware.
Quick Start with Ollama

    ollama run gemma3:12b

| | |
|---|---|
| Creator | Google |
| Parameters | 12B |
| Architecture | transformer-decoder |
| Context | 128K tokens |
| Released | Mar 12, 2025 |
| License | Gemma Terms of Use |
| Ollama | gemma3:12b |
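Beyond the CLI, a running Ollama instance also exposes a local REST API (on port 11434 by default), which is the usual way to call the model from code. A minimal sketch, assuming Ollama is installed and serving `gemma3:12b` locally:

```python
import json
import urllib.request

def build_generate_request(prompt: str, model: str = "gemma3:12b") -> urllib.request.Request:
    """Build a request against Ollama's local /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

req = build_generate_request("Why is the sky blue?")
# Sending it requires a running Ollama server:
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read())["response"])
```

With `"stream": False` the server returns one JSON object whose `response` field holds the full completion; omit it to receive a stream of newline-delimited JSON chunks instead.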
Quantization Options
| Format | File Size | VRAM Required | Ollama Tag |
|---|---|---|---|
| Q4_K_M (recommended) | 8.1 GB | 10.5 GB | 12b-it-q4_K_M |
| Q8_0 | 13 GB | 16 GB | 12b-it-q8_0 |
| F16 | 24 GB | 28 GB | 12b-it-fp16 |
Compatible Hardware
Q4_K_M requires 10.5 GB of VRAM, so it fits comfortably on 12-16 GB GPUs.
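The VRAM figures in the table above follow a simple pattern: weight size (parameter count times bits per weight) plus a few gigabytes of headroom for the KV cache and activations. A rough back-of-the-envelope sketch; the bits-per-weight values and the fixed overhead are approximations, not exact loader behavior:

```python
def estimate_vram_gb(params_billion: float, bits_per_weight: float,
                     overhead_gb: float = 2.5) -> float:
    """Rough VRAM estimate: quantized weights plus a fixed allowance
    for KV cache and activations (overhead_gb is an assumption)."""
    weights_gb = params_billion * bits_per_weight / 8  # billions of params -> GB
    return weights_gb + overhead_gb

# Approximate effective bits per weight for common llama.cpp formats:
# Q4_K_M ~= 4.85, Q8_0 ~= 8.5, F16 = 16
for name, bpw in [("Q4_K_M", 4.85), ("Q8_0", 8.5), ("F16", 16)]:
    print(f"{name}: ~{estimate_vram_gb(12, bpw):.1f} GB")
```

The estimates land within a couple of gigabytes of the table's figures; the remaining gap comes from context length (a longer context means a larger KV cache) and per-runtime allocation details.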
Benchmark Scores

| Benchmark | Score |
|---|---|
| MMLU | 76.0 |