Llama 3.1 405B
by Meta · llama-3 family
405B parameters
Tags: text-generation · code-generation · reasoning · multilingual · tool-use · math · creative-writing · summarization
Llama 3.1 405B is the largest and most capable model in the Llama family, representing Meta's flagship open-source release. It competes directly with leading proprietary models on benchmarks spanning reasoning, coding, math, and multilingual understanding. Running this model locally requires enterprise-grade hardware with multiple high-end GPUs. However, for users with the necessary infrastructure, it provides state-of-the-art open-source performance without any API dependencies.
Quick Start with Ollama
```
ollama run llama3.1:405b-instruct-q4_K_M
```

| Spec | Value |
|---|---|
| Creator | Meta |
| Parameters | 405B |
| Architecture | transformer-decoder |
| Context Length | 128K tokens |
| License | Llama 3.1 Community License |
| Released | Jul 23, 2024 |
| Ollama | llama3.1:405b |
Quantization Options
| Format | File Size | VRAM Required | Quality | Ollama Tag |
|---|---|---|---|---|
| Q4_K_M (recommended) | 196 GB | 244.5 GB | ★★★★★ | 405b-instruct-q4_K_M |
| Q5_K_M | 228.8 GB | 285 GB | ★★★★★ | 405b-instruct-q5_K_M |
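The file sizes above follow directly from parameter count and bits per weight. A minimal sketch of that arithmetic, assuming a hypothetical helper name and an approximate bits-per-weight figure (Q4_K_M averages a bit over 4 bits per weight; the exact ratio is not stated here):

```python
# Rough quantized-size estimator. The bits-per-weight values are
# approximations for illustration, not an official spec.
def quantized_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Estimated on-disk size in decimal GB for a quantized model."""
    return n_params * bits_per_weight / 8 / 1e9

# 405B parameters at ~4 bits/weight lands in the same ballpark as the
# 196 GB Q4_K_M file listed in the table above.
print(round(quantized_size_gb(405e9, 4.0), 1))  # → 202.5
```

Note that VRAM required exceeds the file size because the KV cache and activation buffers must also fit alongside the weights.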
Compatible Hardware for Q4_K_M
Showing compatibility for the recommended quantization (Q4_K_M, 244.5 GB VRAM).
| Hardware | VRAM | Type | Fit |
|---|---|---|---|
| Mac Pro M2 Ultra 192GB | 192 GB | mac | CPU Offload |
| Mac Studio M4 Ultra 192GB | 192 GB | mac | CPU Offload |
34 other hardware devices cannot run this model configuration.
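The Fit column above can be approximated with a simple threshold check. This is an illustrative sketch, not Ollama's actual placement logic; the function name and the half-requirement offload cutoff are assumptions:

```python
def fit_label(device_vram_gb: float, required_gb: float) -> str:
    """Classify how a device handles a given VRAM requirement."""
    if device_vram_gb >= required_gb:
        # Weights, KV cache, and buffers all fit on the GPU.
        return "Full GPU"
    # Unified-memory Macs can still run the model slowly by spilling
    # layers to system RAM; the 50% threshold here is an assumption.
    if device_vram_gb >= required_gb / 2:
        return "CPU Offload"
    return "Won't fit"

# The 192 GB Macs in the table fall short of 244.5 GB, hence offload.
print(fit_label(192, 244.5))  # → CPU Offload
```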
Benchmark Scores
| Benchmark | Score |
|---|---|
| MMLU | 87.3 |