Gemma 3 4B
by Google · gemma-3 family · 4B parameters
text-generation code-generation reasoning multilingual vision summarization
Gemma 3 4B is the smallest multimodal model in the Gemma 3 family, supporting both text and image inputs. It delivers impressive performance for its size, outperforming many larger text-only models on standard benchmarks. With 128K context and vision support at just 5 GB VRAM (Q4), it's an excellent choice for users with 8 GB GPUs who want multimodal capabilities. Drag images into Ollama's desktop app to ask questions about them.
Quick Start with Ollama
ollama run gemma3:4b

| Field | Value |
|---|---|
| Creator | Google |
| Parameters | 4B |
| Architecture | transformer-decoder |
| Context | 128K tokens |
| Released | Mar 12, 2025 |
| License | Gemma Terms of Use |
| Ollama | gemma3:4b |
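Beyond dragging images into the desktop app, images can be sent programmatically through Ollama's REST API: the `/api/generate` endpoint accepts base64-encoded images in an `images` list. The sketch below builds such a request body with only the standard library; the function name `build_vision_request` is illustrative, and it assumes a local Ollama server at the default `http://localhost:11434`.

```python
import base64
import json

def build_vision_request(image_path: str, prompt: str, model: str = "gemma3:4b") -> str:
    """Build the JSON body for Ollama's /api/generate endpoint.

    Ollama expects images as base64-encoded strings in an "images" list.
    POST the returned string to http://localhost:11434/api/generate.
    """
    with open(image_path, "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode("ascii")
    return json.dumps({
        "model": model,           # any vision-capable tag, e.g. gemma3:4b
        "prompt": prompt,         # question to ask about the image
        "images": [image_b64],    # one or more base64-encoded images
        "stream": False,          # return one complete response, not a token stream
    })
```

Sending the result with any HTTP client (curl, `urllib.request`, requests) yields a JSON response whose `response` field contains the model's answer about the image.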
Quantization Options
| Format | File Size | VRAM Required | Quality | Ollama Tag |
|---|---|---|---|---|
| Q4_K_M (recommended) | 3.3 GB | 5 GB | Good | 4b-it-q4_K_M |
| Q8_0 | 5 GB | 7.5 GB | Near-lossless | 4b-it-q8_0 |
| F16 | 8.6 GB | 11.5 GB | Full precision | 4b-it-fp16 |
Compatible Hardware
The recommended Q4_K_M quantization requires 5 GB VRAM, leaving headroom on 8 GB GPUs.
Benchmark Scores

| Benchmark | Score |
|---|---|
| MMLU | 62.0 |