Llama 4 Scout (109B/17B active)

by Meta · llama-4 family

109B parameters

Tags: text-generation, code-generation, reasoning, multilingual, vision, math, tool-use, creative-writing, summarization

Llama 4 Scout is Meta's mixture-of-experts model with 109B total parameters, of which only 17B are active per token across 16 experts. It is natively multimodal (text and images) and supports a context window of up to 10M tokens. At Q4_K_M it needs about 72 GB of memory: too large for a single consumer GPU, but it fits on Macs with 96-128 GB of unified memory or on multi-GPU setups. Despite the large memory footprint, inference speed benefits from only 17B parameters being active per token. It is one of the most capable open-weight models from Meta.
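The memory figures can be roughly reproduced from the parameter count. The sketch below is a back-of-the-envelope estimate, not the method this page uses. The bits-per-weight values are assumptions based on typical llama.cpp-style quantization (8.5 bits for Q8_0, roughly 4.85 on average for Q4_K_M), and `weight_size_gb` is a hypothetical helper name.

```python
# Rough weight-size estimate for a quantized model.
# Assumption: average bits per weight for each format (approximate values).
BITS_PER_WEIGHT = {
    "Q4_K_M": 4.85,  # assumed average; K-quants mix several block types
    "Q8_0": 8.5,     # 32 int8 weights plus one fp16 scale per block
}

def weight_size_gb(params_billion: float, quant: str) -> float:
    """Return the approximate size of the weights in GB (10^9 bytes)."""
    bits = BITS_PER_WEIGHT[quant]
    return params_billion * bits / 8  # billions of params * bytes per param

if __name__ == "__main__":
    for quant in BITS_PER_WEIGHT:
        total = weight_size_gb(109, quant)   # all experts must stay resident
        active = weight_size_gb(17, quant)   # weights touched per token
        print(f"{quant}: ~{total:.0f} GB total, ~{active:.0f} GB active per token")
```

Under these assumptions the totals land close to the listed file sizes. The VRAM requirement is somewhat higher than the file size, presumably because the KV cache and runtime buffers also need room.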

Quick Start with Ollama

ollama run llama4:scout
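Once the model has been pulled, it can also be called programmatically. The following is a minimal sketch, assuming an Ollama server is running locally on its default port (11434) and that the model is available under the `llama4:scout` tag listed on this page. `build_request` and `generate` are hypothetical helper names.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # default local endpoint

def build_request(prompt: str, model: str = "llama4:scout") -> urllib.request.Request:
    """Build a non-streaming generate request for a local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def generate(prompt: str, model: str = "llama4:scout") -> str:
    """Send the prompt and return the generated text."""
    with urllib.request.urlopen(build_request(prompt, model)) as resp:
        return json.loads(resp.read())["response"]

if __name__ == "__main__":
    print(generate("Summarize mixture-of-experts in one sentence."))
```

Setting `"stream": False` returns a single JSON object instead of a stream of partial responses, which keeps the example short.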
Resources: Ollama, Hugging Face, Official Page
Creator: Meta
Parameters: 109B
Architecture: mixture-of-experts
Context: 512K tokens
Released: Apr 5, 2025
License: Llama 4 Community License
Ollama: llama4:scout

Quantization Options

| Format | File Size | VRAM Required | Quality | Ollama Tag |
|---|---|---|---|---|
| Q4_K_M (recommended) | 67 GB | 72 GB | | scout-q4_K_M |
| Q8_0 | 117 GB | 125 GB | | scout-q8_0 |
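Choosing between the two formats comes down to comparing available memory with the VRAM requirement. A minimal sketch, using the requirements from the table above. `best_quant` is a hypothetical helper, and treating all installed memory as available to the model is a simplifying assumption, since the operating system and other processes reserve some of it.

```python
from typing import Optional

# (format, VRAM required in GB), taken from the quantization table
QUANTS = [("Q8_0", 125), ("Q4_K_M", 72)]

def best_quant(available_gb: float) -> Optional[str]:
    """Return the highest-precision format that fits, or None if none do."""
    for name, required_gb in QUANTS:  # ordered from largest to smallest
        if available_gb >= required_gb:
            return name
    return None

if __name__ == "__main__":
    for memory_gb in (192, 96, 48):
        print(memory_gb, "GB ->", best_quant(memory_gb))
```

A result of `None` means the model does not fit entirely in that memory at either format and would need offloading.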

Compatible Hardware

Q4_K_M requires 72 GB VRAM

| Hardware | VRAM | Type | Fit | Est. Speed |
|---|---|---|---|---|
| Mac Studio M4 Ultra 512GB | 512 GB | mac | Runs | ~11 tok/s |
| Mac Pro M2 Ultra 192GB | 192 GB | mac | Runs | ~11 tok/s |
| Mac Studio M4 Ultra 192GB | 192 GB | mac | Runs | ~11 tok/s |
| Mac Studio M4 Max 128GB | 128 GB | mac | Runs | ~8 tok/s |
| MacBook Pro M4 Max 128GB | 128 GB | mac | Runs | ~8 tok/s |
| MacBook Pro M5 Max 128GB | 128 GB | mac | Runs | ~8 tok/s |
| NVIDIA RTX PRO 6000 Blackwell | 96 GB | gpu | Runs | ~27 tok/s |
| MacBook Pro M3 Max 96GB | 96 GB | mac | Runs | ~6 tok/s |
| Mac mini M4 Pro 64GB | 64 GB | mac | CPU Offload | ~1 tok/s |
| Mac Studio M4 Max 64GB | 64 GB | mac | CPU Offload | ~2 tok/s |
| MacBook Pro M4 Max 64GB | 64 GB | mac | CPU Offload | ~2 tok/s |
| MacBook Pro M5 Max 64GB | 64 GB | mac | CPU Offload | ~2 tok/s |
| NVIDIA RTX 6000 Ada Generation | 48 GB | gpu | CPU Offload | ~4 tok/s |
| NVIDIA RTX A6000 | 48 GB | gpu | CPU Offload | ~3 tok/s |
| NVIDIA RTX PRO 5000 Blackwell | 48 GB | gpu | CPU Offload | ~4 tok/s |
| Mac mini M4 Pro 48GB | 48 GB | mac | CPU Offload | ~1 tok/s |
| MacBook Pro M3 Max 48GB | 48 GB | mac | CPU Offload | ~2 tok/s |
| MacBook Pro M4 Max 48GB | 48 GB | mac | CPU Offload | ~2 tok/s |
| MacBook Pro M4 Pro 48GB | 48 GB | mac | CPU Offload | ~1 tok/s |
| MacBook Pro M5 Max 48GB | 48 GB | mac | CPU Offload | ~2 tok/s |
| MacBook Pro M5 Pro 48GB | 48 GB | mac | CPU Offload | ~1 tok/s |
86 hardware devices cannot run this model at Q4_K_M.
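Estimated decode speeds like those listed above are often derived from a memory-bandwidth rule of thumb: divide the device's memory bandwidth by the bytes read per generated token. The sketch below illustrates that rule only. Whether this page computes its estimates this way is an assumption, `est_tokens_per_s` is a hypothetical helper, and the 546 GB/s bandwidth is an assumed example value that does not come from this page.

```python
def est_tokens_per_s(bandwidth_gb_s: float, gb_read_per_token: float) -> float:
    """Upper-bound decode speed when generation is memory-bandwidth bound."""
    if gb_read_per_token <= 0:
        raise ValueError("gb_read_per_token must be positive")
    return bandwidth_gb_s / gb_read_per_token

if __name__ == "__main__":
    bandwidth = 546.0  # GB/s, assumed example value (not from this page)
    # Pessimistic: every weight in the 67 GB Q4_K_M file is read per token.
    print(f"all weights: ~{est_tokens_per_s(bandwidth, 67.0):.1f} tok/s")
    # Optimistic: only the active ~17B of 109B parameters are read per token.
    active_gb = 67.0 * 17 / 109
    print(f"active only: ~{est_tokens_per_s(bandwidth, active_gb):.1f} tok/s")
```

The two calls bracket the mixture-of-experts case: reading only the active experts per token gives a much higher ceiling than reading the whole file. Real throughput depends on the runtime, expert routing, and context length.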

Benchmark Scores

MMLU: 80.0