Llama 3.2 1B

by Meta · llama-3 family

Tags: text-generation · summarization

Llama 3.2 1B is the smallest model in the Llama 3.2 family, designed for ultra-lightweight deployment scenarios. It can handle basic text generation and summarization tasks while requiring minimal compute resources. This model is best suited for simple tasks, prototyping, or situations where hardware is extremely constrained. It runs on virtually any modern device and provides fast inference even on CPU-only setups.

Quick Start with Ollama

ollama run llama3.2:1b-instruct-q8_0
Creator: Meta
Parameters: 1B
Architecture: transformer-decoder
Context Length: 128K tokens
License: Llama 3.2 Community License
Released: Sep 25, 2024
Ollama: llama3.2:1b
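Beyond the CLI, a pulled model can be queried programmatically through Ollama's local HTTP API (`POST /api/generate`). A minimal sketch, assuming Ollama is serving on its default port 11434; `build_generate_request` is an illustrative helper name, not part of any library:

```python
import json
import urllib.request

OLLAMA_HOST = "http://localhost:11434"  # Ollama's default local endpoint


def build_generate_request(prompt: str, model: str = "llama3.2:1b") -> urllib.request.Request:
    """Build a POST request for Ollama's /api/generate endpoint."""
    payload = json.dumps({
        "model": model,    # any pulled tag, e.g. "llama3.2:1b-instruct-q8_0"
        "prompt": prompt,
        "stream": False,   # ask for one complete JSON object instead of a token stream
    }).encode("utf-8")
    return urllib.request.Request(
        f"{OLLAMA_HOST}/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )


# With a running Ollama server, the generated text is in the "response" field:
#   with urllib.request.urlopen(build_generate_request("Hello")) as resp:
#       print(json.loads(resp.read())["response"])
```

The non-streaming form is the simplest to start with; setting `"stream": True` (the API default) instead returns one JSON object per generated chunk.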

Quantization Options

Format               File Size   VRAM Required   Ollama Tag
Q4_K_M               0.8 GB      2.1 GB          1b-instruct-q4_K_M
Q8_0 (recommended)   0.9 GB      3 GB            1b-instruct-q8_0
F16                  1.9 GB      4 GB            1b-instruct-fp16
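The file sizes above follow a simple back-of-envelope rule: weight bytes ≈ parameter count × bits-per-weight ÷ 8. A rough sketch (the bits-per-weight figures are approximate averages; real GGUF files deviate somewhat because embeddings and some tensors are kept at higher precision):

```python
# Approximate average bits per weight for common GGUF formats (assumption:
# Q8_0 stores int8 weights plus one fp16 scale per 32-weight block ~= 8.5 bpw;
# Q4_K_M mixes 4- and 6-bit blocks for roughly 4.8 bpw on average).
BITS_PER_WEIGHT = {"Q4_K_M": 4.8, "Q8_0": 8.5, "F16": 16.0}


def estimated_size_gb(params: float, bits_per_weight: float) -> float:
    """Estimate on-disk weight size in GB for a given quantization."""
    return params * bits_per_weight / 8 / 1e9


# Example: a ~1.24B-parameter model (Llama 3.2 1B's commonly cited true count)
for fmt, bpw in BITS_PER_WEIGHT.items():
    print(f"{fmt}: ~{estimated_size_gb(1.24e9, bpw):.1f} GB")
```

VRAM required is always higher than file size: the runtime also needs room for the KV cache (which grows with context length) and framework overhead, which is why the table's VRAM column exceeds each file size.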

Compatible Hardware for Q8_0

Compatibility is assessed against the recommended quantization (Q8_0), which requires about 3 GB of VRAM; any GPU or unified-memory system with at least that much free memory should run it.

Benchmark Scores

Benchmark   Score
MMLU        49.3