NVIDIA GeForce RTX 5080

NVIDIA · 16GB GDDR7 · Can run 19 models

Manufacturer: NVIDIA
VRAM: 16 GB
Memory Type: GDDR7
Architecture: Blackwell
CUDA Cores: 10,752
Tensor Cores: 336
TDP: 360 W
MSRP: $999
Released: Jan 30, 2025

AI Notes

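The RTX 5080 offers a strong balance of performance and VRAM for local AI. With 16 GB of GDDR7, it comfortably runs models in the 13B to 14B class (about 9.9 GB at Q4_K_M) and fits a 22B model at Q4_K_M with little headroom (14.7 GB). Models in the 27B to 35B range exceed 16 GB at Q4_K_M, so they need more aggressive quantization or partial CPU offload. Blackwell tensor cores provide strong inference speed for its price tier.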

Compatible Models

Model              Parameters  Best Quant  VRAM Used  Fit
Llama 3.2 1B       1B          Q8_0        3 GB       Runs
Gemma 2 2B         2B          Q8_0        4 GB       Runs
Llama 3.2 3B       3B          Q8_0        5 GB       Runs
Phi-3 Mini 3.8B    3.8B        Q8_0        5.8 GB     Runs
DeepSeek R1 7B     7B          Q8_0        9 GB       Runs
Mistral 7B         7B          Q8_0        9 GB       Runs
Qwen 2.5 7B        7B          Q8_0        9 GB       Runs
Qwen 2.5 Coder 7B  7B          Q8_0        9 GB       Runs
Llama 3.1 8B       8B          Q8_0        10 GB      Runs
Gemma 2 9B         9B          Q8_0        11 GB      Runs
DeepSeek R1 14B    14B         Q4_K_M      9.9 GB     Runs
Phi-4 14B          14B         Q4_K_M      9.9 GB     Runs
Qwen 2.5 14B       14B         Q4_K_M      9.9 GB     Runs
Codestral 22B      22B         Q4_K_M      14.7 GB    Runs (tight)
StarCoder2 15B     15B         Q8_0        17 GB      CPU Offload
Gemma 2 27B        27B         Q4_K_M      17.7 GB    CPU Offload
DeepSeek R1 32B    32B         Q4_K_M      20.7 GB    CPU Offload
Qwen 2.5 32B       32B         Q4_K_M      20.7 GB    CPU Offload
Command R 35B      35B         Q4_K_M      22.5 GB    CPU Offload

6 models are too large for this hardware.
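
The VRAM Used figures above follow a simple linear pattern, which the sketch below reproduces. This is a fit inferred from the table rows, not a formula documented by this page: the per-quant coefficients, the names estimate_vram_gb and classify_fit, and the 90% cutoff for "Runs (tight)" are all assumptions chosen for illustration. Actual usage also depends on context length and the runtime.

```python
# Rough VRAM estimate that matches the rows in the table above.
# ASSUMPTION: coefficients are a linear fit inferred from the table,
# not an official formula. quant -> (GB per billion params, fixed overhead GB)
QUANT_FIT = {
    "Q8_0": (1.0, 2.0),
    "Q4_K_M": (0.6, 1.5),
}

VRAM_GB = 16.0         # RTX 5080
TIGHT_FRACTION = 0.9   # ASSUMPTION: hypothetical cutoff for "Runs (tight)"


def estimate_vram_gb(params_billion: float, quant: str) -> float:
    """Estimate VRAM use in GB for a model of the given size and quant."""
    gb_per_billion, overhead_gb = QUANT_FIT[quant]
    return round(params_billion * gb_per_billion + overhead_gb, 1)


def classify_fit(vram_used_gb: float, vram_total_gb: float = VRAM_GB) -> str:
    """Map an estimated footprint to the Fit labels used in the table."""
    if vram_used_gb > vram_total_gb:
        return "CPU Offload"
    if vram_used_gb >= TIGHT_FRACTION * vram_total_gb:
        return "Runs (tight)"
    return "Runs"


if __name__ == "__main__":
    for name, params, quant in [
        ("Llama 3.1 8B", 8, "Q8_0"),
        ("Codestral 22B", 22, "Q4_K_M"),
        ("Qwen 2.5 32B", 32, "Q4_K_M"),
    ]:
        used = estimate_vram_gb(params, quant)
        print(f"{name}: {used} GB, {classify_fit(used)}")
```

Under these assumptions, a 22B model at Q4_K_M comes out at 14.7 GB, which is above 90% of 16 GB and so lands in "Runs (tight)", while anything estimated above 16 GB falls into "CPU Offload".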