Qwen 3 8B
by Alibaba · qwen-3 family
8B
parameters
text-generation code-generation reasoning multilingual math tool-use summarization
Qwen 3 8B is the workhorse of the Qwen 3 dense lineup, offering an excellent balance of capability and resource efficiency. Features hybrid thinking mode for adaptive reasoning depth and supports tool calling for agentic workflows. At Q4 it fits on 8 GB GPUs with some headroom, and runs comfortably on 12-16 GB hardware. Strong at coding, math, and multilingual tasks — a direct upgrade over Llama 3.1 8B in most benchmarks.
Quick Start with Ollama
ollama run 8b-q4_K_M | Creator | Alibaba |
| Parameters | 8B |
| Architecture | transformer-decoder |
| Context | 128K tokens |
| Released | Apr 29, 2025 |
| License | Apache 2.0 |
| Ollama | qwen3:8b |
Quantization Options
| Format | File Size | VRAM Required | Quality | Ollama Tag |
|---|---|---|---|---|
| Q4_K_M rec | 5.2 GB | 7.5 GB | | 8b-q4_K_M |
| Q8_0 | 8.9 GB | 11.5 GB | | 8b-q8_0 |
| F16 | 16.5 GB | 20 GB | | 8b-fp16 |
Compatible Hardware
Q4_K_M requires 7.5 GB VRAM
Benchmark Scores
73.5
mmlu