
Mistral Nemo 12B

by Mistral AI · mistral family

Tags: text-generation, code-generation, reasoning, multilingual, tool-use, summarization

Mistral Nemo 12B was built jointly by Mistral AI and NVIDIA. It has a 128K-token context window and uses the Tekken tokenizer, which encodes text more efficiently across languages than the tokenizers of prior Mistral models. With more than 3.4M Ollama pulls, it is one of the most popular models at its size. At Q4 it needs about 9.5 GB of VRAM, so it fits comfortably on 12 GB GPUs, making it a strong contender alongside Gemma 3 12B. It excels at function calling, multilingual tasks, and general instruction following.

Quick Start with Ollama

ollama run mistral-nemo:12b-instruct-q4_K_M
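Once the model is running, it can also be queried programmatically. The following is a minimal sketch, assuming a local Ollama server on its default port (11434) and the `mistral-nemo` model name listed on this page. The helper names `build_request` and `generate` are hypothetical and chosen for illustration only.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt, model="mistral-nemo", num_ctx=None):
    """Build a non-streaming POST request for Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    if num_ctx is not None:
        # Optional: request a larger context window than the server default.
        payload["options"] = {"num_ctx": num_ctx}
    data = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        OLLAMA_URL,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, **kwargs):
    """Send the prompt and return the generated text (needs a running server)."""
    with urllib.request.urlopen(build_request(prompt, **kwargs)) as resp:
        return json.loads(resp.read())["response"]
```

Calling `generate("Summarize this paragraph: ...")` only works with the server running and the model already pulled. Ollama may load the model with a context much smaller than the 128K maximum unless `num_ctx` is set, and larger values increase memory use beyond the VRAM figures quoted on this page.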
Creator: Mistral AI
Parameters: 12B
Architecture: transformer-decoder
Context: 128K tokens
Released: Jul 18, 2024
License: Apache 2.0
Ollama: mistral-nemo

Quantization Options

| Format | File Size | VRAM Required | Ollama Tag |
| --- | --- | --- | --- |
| Q4_K_M (recommended) | 7.1 GB | 9.5 GB | 12b-instruct-q4_K_M |
| Q8_0 | 12.9 GB | 16 GB | 12b-instruct-q8_0 |
| F16 | 24.5 GB | 28 GB | 12b-instruct-fp16 |
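The file sizes and VRAM figures above imply an effective number of bits per weight and a fixed memory overhead on top of the weights. The sketch below works through that arithmetic, assuming the rounded 12B parameter count from this page and decimal gigabytes (1 GB = 10^9 bytes). The function names are hypothetical.

```python
# Assumption: 12B is the rounded parameter count quoted on this page.
PARAMS_BILLION = 12.0

# (file size in GB, VRAM required in GB), copied from the table above.
QUANTS = {
    "Q4_K_M": (7.1, 9.5),
    "Q8_0": (12.9, 16.0),
    "F16": (24.5, 28.0),
}


def bits_per_weight(file_gb, params_billion=PARAMS_BILLION):
    """Effective bits per weight; the 1e9 factors in GB and billions cancel."""
    return file_gb * 8 / params_billion


def vram_overhead_gb(file_gb, vram_gb):
    """VRAM needed beyond the weights themselves (KV cache, buffers, etc.)."""
    return vram_gb - file_gb


for name, (file_gb, vram_gb) in QUANTS.items():
    print(
        f"{name}: ~{bits_per_weight(file_gb):.1f} bits/weight, "
        f"~{vram_overhead_gb(file_gb, vram_gb):.1f} GB overhead"
    )
```

Under these assumptions the F16 row comes out slightly above 16 bits per weight, which suggests the true parameter count is a little over 12B or that the file holds some non-weight data. Treat the results as rough estimates rather than exact figures.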

Compatible Hardware

Q4_K_M requires 9.5 GB VRAM


| Hardware | VRAM | Type | Fit | Est. Speed |
| --- | --- | --- | --- | --- |
| Mac Studio M4 Ultra 512GB | 512 GB | mac | Runs | ~86 tok/s |
| Mac Pro M2 Ultra 192GB | 192 GB | mac | Runs | ~84 tok/s |
| Mac Studio M4 Ultra 192GB | 192 GB | mac | Runs | ~86 tok/s |
| Mac Studio M4 Max 128GB | 128 GB | mac | Runs | ~57 tok/s |
| MacBook Pro M4 Max 128GB | 128 GB | mac | Runs | ~57 tok/s |
| MacBook Pro M5 Max 128GB | 128 GB | mac | Runs | ~57 tok/s |
| NVIDIA RTX PRO 6000 Blackwell | 96 GB | gpu | Runs | ~202 tok/s |
| MacBook Pro M3 Max 96GB | 96 GB | mac | Runs | ~42 tok/s |
| Mac mini M4 Pro 64GB | 64 GB | mac | Runs | ~29 tok/s |
| Mac Studio M4 Max 64GB | 64 GB | mac | Runs | ~57 tok/s |
| MacBook Pro M4 Max 64GB | 64 GB | mac | Runs | ~57 tok/s |
| MacBook Pro M5 Max 64GB | 64 GB | mac | Runs | ~57 tok/s |
| NVIDIA RTX 6000 Ada Generation | 48 GB | gpu | Runs | ~101 tok/s |
| NVIDIA RTX A6000 | 48 GB | gpu | Runs | ~81 tok/s |
| NVIDIA RTX PRO 5000 Blackwell | 48 GB | gpu | Runs | ~101 tok/s |
| Mac mini M4 Pro 48GB | 48 GB | mac | Runs | ~29 tok/s |
| MacBook Pro M3 Max 48GB | 48 GB | mac | Runs | ~42 tok/s |
| MacBook Pro M4 Max 48GB | 48 GB | mac | Runs | ~57 tok/s |
| MacBook Pro M4 Pro 48GB | 48 GB | mac | Runs | ~29 tok/s |
| MacBook Pro M5 Max 48GB | 48 GB | mac | Runs | ~43 tok/s |
| MacBook Pro M5 Pro 48GB | 48 GB | mac | Runs | ~29 tok/s |
| Mac Studio M4 Max 36GB | 36 GB | mac | Runs | ~57 tok/s |
| MacBook Pro M3 Pro 36GB | 36 GB | mac | Runs | ~16 tok/s |
| MacBook Pro M5 Max 36GB | 36 GB | mac | Runs | ~43 tok/s |
| NVIDIA RTX 5000 Ada Generation | 32 GB | gpu | Runs | ~76 tok/s |
| NVIDIA GeForce RTX 5090 | 32 GB | gpu | Runs | ~189 tok/s |
| iMac M4 32GB | 32 GB | mac | Runs | ~13 tok/s |
| Mac mini M4 32GB | 32 GB | mac | Runs | ~13 tok/s |
| MacBook Air M4 32GB | 32 GB | mac | Runs | ~13 tok/s |
| MacBook Air M5 32GB | 32 GB | mac | Runs | ~13 tok/s |
| MacBook Pro M5 32GB | 32 GB | mac | Runs | ~13 tok/s |
| AMD Radeon RX 7900 XTX | 24 GB | gpu | Runs | ~101 tok/s |
| NVIDIA GeForce RTX 3090 | 24 GB | gpu | Runs | ~99 tok/s |
| NVIDIA GeForce RTX 3090 Ti | 24 GB | gpu | Runs | ~106 tok/s |
| NVIDIA GeForce RTX 4090 | 24 GB | gpu | Runs | ~106 tok/s |
| NVIDIA RTX A5000 | 24 GB | gpu | Runs | ~81 tok/s |
| iMac M3 24GB | 24 GB | mac | Runs | ~11 tok/s |
| Mac mini M2 24GB | 24 GB | mac | Runs | ~11 tok/s |
| Mac mini M4 Pro 24GB | 24 GB | mac | Runs | ~29 tok/s |
| MacBook Air M2 24GB | 24 GB | mac | Runs | ~11 tok/s |
| MacBook Air M4 24GB | 24 GB | mac | Runs | ~13 tok/s |
| MacBook Air M5 24GB | 24 GB | mac | Runs | ~13 tok/s |
| MacBook Pro M4 Pro 24GB | 24 GB | mac | Runs | ~29 tok/s |
| MacBook Pro M5 24GB | 24 GB | mac | Runs | ~13 tok/s |
| MacBook Pro M5 Pro 24GB | 24 GB | mac | Runs | ~29 tok/s |
| AMD Radeon RX 7900 XT | 20 GB | gpu | Runs | ~84 tok/s |
| NVIDIA RTX 4000 Ada Generation | 20 GB | gpu | Runs | ~38 tok/s |
| MacBook Pro M3 Pro 18GB | 18 GB | mac | Runs | ~16 tok/s |
| AMD Radeon RX 6900 XT | 16 GB | gpu | Runs | ~54 tok/s |
| AMD Radeon RX 6800 XT | 16 GB | gpu | Runs | ~54 tok/s |
| AMD Radeon RX 7800 XT | 16 GB | gpu | Runs | ~66 tok/s |
| AMD Radeon RX 9060 XT 16GB | 16 GB | gpu | Runs | ~57 tok/s |
| AMD Radeon RX 9070 XT | 16 GB | gpu | Runs | ~68 tok/s |
| AMD Radeon RX 9070 | 16 GB | gpu | Runs | ~57 tok/s |
| Intel Arc A770 | 16 GB | gpu | Runs | ~59 tok/s |
| NVIDIA GeForce RTX 4060 Ti 16GB | 16 GB | gpu | Runs | ~30 tok/s |
| NVIDIA GeForce RTX 4070 Ti Super | 16 GB | gpu | Runs | ~71 tok/s |
| NVIDIA GeForce RTX 4080 Super | 16 GB | gpu | Runs | ~77 tok/s |
| NVIDIA GeForce RTX 4080 | 16 GB | gpu | Runs | ~75 tok/s |
| NVIDIA GeForce RTX 5060 Ti 16GB | 16 GB | gpu | Runs | ~47 tok/s |
| NVIDIA GeForce RTX 5070 Ti | 16 GB | gpu | Runs | ~94 tok/s |
| NVIDIA GeForce RTX 5080 | 16 GB | gpu | Runs | ~101 tok/s |
| NVIDIA RTX A4000 | 16 GB | gpu | Runs | ~47 tok/s |
| iMac M1 16GB | 16 GB | mac | Runs | ~7 tok/s |
| iMac M4 16GB | 16 GB | mac | Runs | ~13 tok/s |
| Mac mini M1 16GB | 16 GB | mac | Runs | ~7 tok/s |
| Mac mini M4 16GB | 16 GB | mac | Runs | ~13 tok/s |
| MacBook Air M2 16GB | 16 GB | mac | Runs | ~11 tok/s |
| MacBook Air M3 16GB | 16 GB | mac | Runs | ~11 tok/s |
| MacBook Air M4 16GB | 16 GB | mac | Runs | ~13 tok/s |
| MacBook Air M5 16GB | 16 GB | mac | Runs | ~13 tok/s |
| MacBook Pro M1 16GB | 16 GB | mac | Runs | ~7 tok/s |
| MacBook Pro M2 Pro 16GB | 16 GB | mac | Runs | ~21 tok/s |
| MacBook Pro M5 16GB | 16 GB | mac | Runs | ~13 tok/s |
| AMD Radeon RX 6700 XT | 12 GB | gpu | Runs | ~40 tok/s |
| AMD Radeon RX 7700 XT | 12 GB | gpu | Runs | ~45 tok/s |
| Intel Arc B580 | 12 GB | gpu | Runs | ~48 tok/s |
| NVIDIA GeForce RTX 3060 12GB | 12 GB | gpu | Runs | ~38 tok/s |
| NVIDIA GeForce RTX 3080 12GB | 12 GB | gpu | Runs | ~96 tok/s |
| NVIDIA GeForce RTX 4070 Super | 12 GB | gpu | Runs | ~53 tok/s |
| NVIDIA GeForce RTX 4070 Ti | 12 GB | gpu | Runs | ~53 tok/s |
| NVIDIA GeForce RTX 4070 | 12 GB | gpu | Runs | ~53 tok/s |
| NVIDIA GeForce RTX 5070 | 12 GB | gpu | Runs | ~71 tok/s |
| NVIDIA GeForce GTX 1080 Ti | 11 GB | gpu | Runs (tight) | ~51 tok/s |
| NVIDIA GeForce RTX 2080 Ti | 11 GB | gpu | Runs (tight) | ~65 tok/s |
| Intel Arc B570 | 10 GB | gpu | Runs (tight) | ~40 tok/s |
| NVIDIA GeForce RTX 3080 10GB | 10 GB | gpu | Runs (tight) | ~80 tok/s |
| AMD Radeon RX 6600 XT | 8 GB | gpu | CPU Offload | ~8 tok/s |
| AMD Radeon RX 7600 | 8 GB | gpu | CPU Offload | ~9 tok/s |
| AMD Radeon RX 9060 XT 8GB | 8 GB | gpu | CPU Offload | ~8 tok/s |
| Intel Arc A750 | 8 GB | gpu | CPU Offload | ~16 tok/s |
| NVIDIA GeForce GTX 1070 | 8 GB | gpu | CPU Offload | ~8 tok/s |
| NVIDIA GeForce RTX 2060 Super | 8 GB | gpu | CPU Offload | ~14 tok/s |
| NVIDIA GeForce RTX 2070 Super | 8 GB | gpu | CPU Offload | ~14 tok/s |
| NVIDIA GeForce RTX 2080 Super | 8 GB | gpu | CPU Offload | ~16 tok/s |
| NVIDIA GeForce RTX 3050 | 8 GB | gpu | CPU Offload | ~7 tok/s |
| NVIDIA GeForce RTX 3070 | 8 GB | gpu | CPU Offload | ~14 tok/s |
| NVIDIA GeForce RTX 3060 Ti | 8 GB | gpu | CPU Offload | ~14 tok/s |
| NVIDIA GeForce RTX 4060 Ti 8GB | 8 GB | gpu | CPU Offload | ~9 tok/s |
| NVIDIA GeForce RTX 4060 | 8 GB | gpu | CPU Offload | ~9 tok/s |
| NVIDIA GeForce RTX 5050 | 8 GB | gpu | CPU Offload | ~7 tok/s |
| NVIDIA GeForce RTX 5060 Ti 8GB | 8 GB | gpu | CPU Offload | ~14 tok/s |
| NVIDIA GeForce RTX 5060 | 8 GB | gpu | CPU Offload | ~11 tok/s |
| MacBook Air M1 8GB | 8 GB | mac | CPU Offload | ~2 tok/s |
| MacBook Air M2 8GB | 8 GB | mac | CPU Offload | ~3 tok/s |
2 hardware devices cannot run this model at Q4_K_M.
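The Fit column in the hardware table appears to follow a simple threshold rule on available memory. The sketch below reproduces the three listed labels, assuming the 9.5 GB Q4_K_M requirement from this page and a 12 GB cutoff for "tight" that is inferred from the rows (11 GB is tight, 12 GB is not) rather than documented. The `classify_fit` helper is hypothetical, and the threshold for devices that cannot run the model at all is not given, so it is not modeled.

```python
# Stated on this page: Q4_K_M requires 9.5 GB VRAM.
REQUIRED_VRAM_GB = 9.5
# Assumption: inferred from the table rows, not a documented threshold.
TIGHT_BELOW_GB = 12.0


def classify_fit(vram_gb, required_gb=REQUIRED_VRAM_GB, tight_below_gb=TIGHT_BELOW_GB):
    """Return a Fit label for a device with the given memory in GB."""
    if vram_gb < required_gb:
        # The model does not fit entirely in VRAM, so some layers run on the CPU.
        return "CPU Offload"
    if vram_gb < tight_below_gb:
        # The model fits, but with little headroom for context or other apps.
        return "Runs (tight)"
    return "Runs"


for vram in (8, 10, 11, 12, 24):
    print(f"{vram} GB -> {classify_fit(vram)}")
```

Passing a different `required_gb` (for example 16 for Q8_0) gives a rough fit estimate for the other quantizations, though the "tight" cutoff would need to be adjusted as well.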

Benchmark Scores

MMLU: 68.0