Qwen 3 4B
by Alibaba · qwen-3 family
4B
parameters
text-generation code-generation reasoning multilingual math summarization
Qwen 3 4B is a compact dense model with hybrid thinking mode — it can answer directly for simple questions or engage step-by-step reasoning for complex tasks. Supports 29+ languages and 128K context. At Q4 it fits easily on any 8 GB GPU or Mac, making it an excellent lightweight daily driver. Punches well above its weight on reasoning and math benchmarks compared to similarly sized models.
Quick Start with Ollama
ollama run 4b-q4_K_M | Creator | Alibaba |
| Parameters | 4B |
| Architecture | transformer-decoder |
| Context | 128K tokens |
| Released | Apr 29, 2025 |
| License | Apache 2.0 |
| Ollama | qwen3:4b |
Quantization Options
| Format | File Size | VRAM Required | Quality | Ollama Tag |
|---|---|---|---|---|
| Q4_K_M rec | 2.5 GB | 4.5 GB | | 4b-q4_K_M |
| Q8_0 | 4.5 GB | 6.5 GB | | 4b-q8_0 |
| F16 | 8.2 GB | 11 GB | | 4b-fp16 |
Compatible Hardware
Q4_K_M requires 4.5 GB VRAM
Benchmark Scores
65.0
mmlu