Llama 3.2 3B
by Meta · llama-3 family
3B parameters
Tags: text-generation, code-generation, multilingual, summarization
Llama 3.2 3B is a lightweight model from Meta designed for edge deployment and on-device inference. Despite its small size, it delivers surprisingly capable performance for text generation, summarization, and basic coding tasks. This model is ideal for users with limited hardware who still want a capable assistant. It runs comfortably on most modern laptops and even some mobile devices, making it one of the most accessible models in the Llama family.
Quick Start with Ollama
```shell
ollama run llama3.2:3b-instruct-q8_0
```

| Spec | Value |
|---|---|
| Creator | Meta |
| Parameters | 3B |
| Architecture | transformer-decoder |
| Context Length | 128K tokens |
| License | Llama 3.2 Community License |
| Released | Sep 25, 2024 |
| Ollama | llama3.2 |
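Beyond the interactive `ollama run` session, the model can be queried programmatically through Ollama's local REST API. A minimal sketch, assuming an Ollama server running on the default port 11434 with the Q8_0 tag pulled (the helper names here are illustrative, not part of Ollama itself):

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default endpoint
MODEL_TAG = "llama3.2:3b-instruct-q8_0"

def build_generate_payload(prompt: str, stream: bool = False) -> dict:
    """Build the JSON body for Ollama's /api/generate endpoint."""
    return {"model": MODEL_TAG, "prompt": prompt, "stream": stream}

def generate(prompt: str) -> str:
    """Send a prompt to the local Ollama server and return the response text."""
    body = json.dumps(build_generate_payload(prompt)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# Example (requires a running Ollama server):
#   generate("Summarize what quantization does in one sentence.")
```

With `stream=False` the server returns a single JSON object whose `response` field holds the full completion; set `stream=True` to receive incremental chunks instead.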
Quantization Options
| Format | File Size | VRAM Required | Quality | Ollama Tag |
|---|---|---|---|---|
| Q4_K_M | 1.8 GB | 3.3 GB | ★★★★★ | 3b-instruct-q4_K_M |
| Q8_0 (recommended) | 2.7 GB | 5 GB | ★★★★★ | 3b-instruct-q8_0 |
| F16 | 5.7 GB | 8 GB | ★★★★★ | 3b-instruct-fp16 |
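The file sizes above follow roughly from bits-per-weight. A back-of-the-envelope sketch, assuming ~3.2B weights and effective bit widths that are ballpark figures only (actual GGUF files differ because some layers stay at higher precision):

```python
PARAMS = 3.2e9  # Llama 3.2 3B has roughly 3.2 billion parameters

# Effective bits per weight -- assumed averages, since the K-quant and Q8_0
# formats carry per-block scale overhead on top of the raw quantized bits.
BITS_PER_WEIGHT = {"Q4_K_M": 4.8, "Q8_0": 8.5, "F16": 16.0}

def weight_size_gb(fmt: str) -> float:
    """Rough on-disk weight size in GB for a quantization format."""
    return PARAMS * BITS_PER_WEIGHT[fmt] / 8 / 1e9

for fmt in BITS_PER_WEIGHT:
    print(f"{fmt}: ~{weight_size_gb(fmt):.1f} GB")
```

The estimates land in the same ballpark as the table; VRAM needs are higher than file size because the KV cache and activations also occupy memory at inference time.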
Compatible Hardware for Q8_0
Showing compatibility for the recommended quantization (Q8_0, 5 GB VRAM).
Benchmark Scores

| Benchmark | Score |
|---|---|
| MMLU | 63.4 |