NVIDIA GeForce RTX 3080 12GB
NVIDIA · 12GB GDDR6X · Can run 16 models
| Spec | Value |
|---|---|
| Manufacturer | NVIDIA |
| VRAM | 12 GB |
| Memory Type | GDDR6X |
| Architecture | Ampere |
| CUDA Cores | 8,960 |
| Tensor Cores | 280 |
| TDP | 350 W |
| MSRP | $799 |
| Released | Jan 11, 2022 |
AI Notes
The RTX 3080 12GB is a decent option for running local AI models. With 12 GB of GDDR6X VRAM, it can handle 7B–8B models at 8-bit quantization (Q8_0) and 14B models at 4-bit quantization (Q4_K_M). Its older Ampere architecture is slower per core than Ada Lovelace but still delivers solid inference performance.
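The VRAM figures in the table below follow a common rule of thumb: weight size is roughly parameter count times effective bytes per weight for the chosen quantization, plus a fixed overhead for the KV cache and CUDA context. The sketch below illustrates that estimate; the bytes-per-weight constants and the 1.5 GB overhead are assumptions based on typical GGUF file sizes, not an exact formula, and real usage varies with context length and runtime.

```python
# Rule-of-thumb VRAM estimate for quantized models (a sketch, not an exact formula).
# Effective bytes per weight are assumptions based on typical GGUF sizes:
# quantized formats carry some per-block metadata beyond the raw bit width.
BYTES_PER_WEIGHT = {
    "Q4_K_M": 0.57,  # ~4.5 bits/weight effective (assumption)
    "Q8_0": 1.06,    # ~8.5 bits/weight effective (assumption)
    "FP16": 2.0,
}

def estimate_vram_gb(params_billions: float, quant: str, overhead_gb: float = 1.5) -> float:
    """Estimate VRAM in GB: weights plus fixed overhead for KV cache / CUDA context."""
    return params_billions * BYTES_PER_WEIGHT[quant] + overhead_gb

if __name__ == "__main__":
    for name, params, quant in [("Llama 3.1 8B", 8, "Q8_0"),
                                ("Qwen 2.5 14B", 14, "Q4_K_M")]:
        need = estimate_vram_gb(params, quant)
        verdict = "fits" if need <= 12 else "needs CPU offload"
        print(f"{name} @ {quant}: ~{need:.1f} GB -> {verdict} on 12 GB")
```

With these constants the estimates land close to the table: Llama 3.1 8B at Q8_0 comes out near 10 GB, and the 14B models at Q4_K_M near 9.5–10 GB.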
Compatible Models
| Model | Parameters | Best Quant | VRAM Used | Fit |
|---|---|---|---|---|
| Llama 3.2 1B | 1B | Q8_0 | 3 GB | Runs |
| Gemma 2 2B | 2B | Q8_0 | 4 GB | Runs |
| Llama 3.2 3B | 3B | Q8_0 | 5 GB | Runs |
| Phi-3 Mini 3.8B | 3.8B | Q8_0 | 5.8 GB | Runs |
| DeepSeek R1 7B | 7B | Q8_0 | 9 GB | Runs |
| Mistral 7B | 7B | Q8_0 | 9 GB | Runs |
| Qwen 2.5 7B | 7B | Q8_0 | 9 GB | Runs |
| Qwen 2.5 Coder 7B | 7B | Q8_0 | 9 GB | Runs |
| Llama 3.1 8B | 8B | Q8_0 | 10 GB | Runs |
| DeepSeek R1 14B | 14B | Q4_K_M | 9.9 GB | Runs |
| Phi-4 14B | 14B | Q4_K_M | 9.9 GB | Runs |
| Qwen 2.5 14B | 14B | Q4_K_M | 9.9 GB | Runs |
| Gemma 2 9B | 9B | Q8_0 | 11 GB | Runs (tight) |
| StarCoder2 15B | 15B | Q8_0 | 17 GB | CPU Offload |
| Codestral 22B | 22B | Q4_K_M | 14.7 GB | CPU Offload |
| Gemma 2 27B | 27B | Q4_K_M | 17.7 GB | CPU Offload |
9 models are too large for this hardware.
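Models marked CPU Offload can still run by keeping only part of the layer stack on the GPU and evaluating the rest on the CPU, at a substantial speed cost. A minimal sketch using llama-cpp-python; the model path and layer split are illustrative assumptions, and in practice you tune n_gpu_layers downward until the model fits in 12 GB:

```python
# Partial GPU offload sketch with llama-cpp-python (pip install llama-cpp-python,
# built with CUDA support). Model path and layer count are illustrative assumptions.
from llama_cpp import Llama

llm = Llama(
    model_path="./gemma-2-27b-Q4_K_M.gguf",  # hypothetical local GGUF file
    n_gpu_layers=28,  # keep ~28 layers on the 12 GB GPU; remaining layers run on CPU
    n_ctx=4096,       # context window; larger contexts need more VRAM for the KV cache
)

out = llm("Summarize what CPU offload trades away in one sentence.", max_tokens=64)
print(out["choices"][0]["text"])
```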