Qwen 2.5 Coder 32B
by Alibaba · qwen-2.5 family
32B parameters
text-generation code-generation reasoning
Qwen 2.5 Coder 32B is the flagship of the Qwen 2.5 Coder series and one of the strongest open-source coding models available. It rivals GPT-4o on code generation benchmarks and supports 128K context for working with large codebases. At Q4 quantization it needs about 23 GB of VRAM, so it fits on an RTX 3090/5090 or a Mac with 24 GB+ unified memory. It is the go-to choice for developers who want the best possible local coding assistant and have the hardware to support it.
Quick Start with Ollama
`ollama run qwen2.5-coder:32b-instruct-q4_K_M`

| Spec | Value |
|---|---|
| Creator | Alibaba |
| Parameters | 32B |
| Architecture | transformer-decoder |
| Context | 128K tokens |
| Released | Nov 12, 2024 |
| License | Apache 2.0 |
| Ollama | qwen2.5-coder:32b |
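Beyond the interactive CLI, a locally running Ollama server also exposes an HTTP API (`POST /api/generate` on port 11434 by default), which is how editors and tools usually talk to the model. A minimal sketch in Python using only the standard library; the helper names and prompt are illustrative, not part of any official client:

```python
import json
import urllib.request

# Default endpoint of a locally running Ollama server (assumption: default port).
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_payload(prompt: str, model: str = "qwen2.5-coder:32b") -> dict:
    """Assemble the JSON body for a non-streaming generate request."""
    return {"model": model, "prompt": prompt, "stream": False}

def generate(prompt: str) -> str:
    """Send the prompt to the local Ollama server and return the completion text."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["response"]

# Example (requires a running Ollama server with the model already pulled):
# print(generate("Write a Python function that reverses a string."))
```

With `stream` set to `True` instead, the server returns one JSON object per generated chunk, which is what chat UIs use to show tokens as they arrive.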
Quantization Options
| Format | File Size | VRAM Required | Ollama Tag |
|---|---|---|---|
| Q4_K_M (recommended) | 20 GB | 23 GB | 32b-instruct-q4_K_M |
| Q8_0 | 34.5 GB | 39 GB | 32b-instruct-q8_0 |
| F16 | 65 GB | 70 GB | 32b-instruct-fp16 |
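The file sizes above follow almost directly from bits per weight: roughly parameters × bits ÷ 8, with VRAM needs a little higher to cover activations and KV cache. A back-of-the-envelope sketch; the effective bit widths below are assumptions (K-quants mix bit widths, so Q4_K_M averages a bit under 5 bits per weight), not official figures:

```python
def model_file_gb(params_b: float, bits_per_weight: float) -> float:
    """Approximate on-disk model size in GB: parameters * bits / 8."""
    return params_b * 1e9 * bits_per_weight / 8 / 1e9

# Assumed effective bits per weight for each format (illustrative averages).
for name, bits in [("Q4_K_M", 4.85), ("Q8_0", 8.5), ("F16", 16.0)]:
    print(f"{name}: ~{model_file_gb(32, bits):.1f} GB file")
```

For a 32B model this lands near the table's 20 / 34.5 / 65 GB figures; the gap between file size and the VRAM column is the runtime overhead.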
Compatible Hardware
Q4_K_M requires about 23 GB of VRAM, which fits a 24 GB+ card such as the RTX 3090 or 5090, or a Mac with 24 GB+ unified memory.
Benchmark Scores
| Benchmark | Score |
|---|---|
| MMLU | 78.0 |