Gemma 2 9B

by Google · gemma-2 family

9B parameters

text-generation code-generation reasoning multilingual summarization

Gemma 2 9B is Google's mid-sized open model in the Gemma 2 family. It was trained with knowledge distillation from larger Gemma models, which gives it strong quality for its parameter count and lets it outperform many larger models on key benchmarks. The model is well suited to reasoning, text generation, and multilingual tasks. With an 8K-token context window and moderate resource requirements, it is a good choice for general-purpose use on consumer-grade hardware.

Quick Start with Ollama

ollama run gemma2:9b-instruct-q8_0

Creator Google
Parameters 9B
Architecture transformer-decoder
Context Length 8K tokens
License Gemma Terms of Use
Released Jun 27, 2024
Ollama gemma2
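
The Quick Start command can also be issued programmatically. The following is a minimal sketch, assuming an Ollama server is running locally on its default port (11434) and that the model has already been pulled; the helper names build_request and generate are illustrative and not part of any library.

```python
import json
import urllib.request

# Ollama's default local endpoint for one-shot generation.
OLLAMA_URL = "http://localhost:11434/api/generate"
# Model name "gemma2" plus the recommended tag listed on this page.
MODEL = "gemma2:9b-instruct-q8_0"


def build_request(prompt: str, model: str = MODEL) -> urllib.request.Request:
    """Build a non-streaming generate request for a local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, model: str = MODEL) -> str:
    """Send the prompt and return the generated text.

    Requires a running Ollama server; raises URLError otherwise.
    """
    with urllib.request.urlopen(build_request(prompt, model)) as resp:
        return json.loads(resp.read())["response"]
```

With the server running, generate("Summarize the Gemma 2 family in one sentence.") should return the model's reply as a string.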

Quantization Options

Format              File Size   VRAM Required   Ollama Tag
Q4_K_M              4.4 GB      6.9 GB          9b-instruct-q4_K_M
Q8_0 (recommended)  8.1 GB      11 GB           9b-instruct-q8_0
F16                 17.1 GB     20 GB           9b-instruct-fp16
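
The table above can drive a simple selection rule. This sketch picks the highest-precision format whose listed VRAM requirement fits a given budget; the rule itself and the name pick_quantization are assumptions for illustration, and the page's own recommendation remains Q8_0.

```python
# (format, file size in GB, VRAM required in GB, Ollama tag), copied from the
# table on this page and ordered from highest to lowest precision.
QUANTIZATIONS = [
    ("F16", 17.1, 20.0, "9b-instruct-fp16"),
    ("Q8_0", 8.1, 11.0, "9b-instruct-q8_0"),
    ("Q4_K_M", 4.4, 6.9, "9b-instruct-q4_K_M"),
]


def pick_quantization(vram_gb: float):
    """Return (format, full Ollama model name) for the highest-precision
    format that fits in vram_gb, or None if nothing fits."""
    for fmt, _file_size_gb, vram_required_gb, tag in QUANTIZATIONS:
        if vram_gb >= vram_required_gb:
            return fmt, f"gemma2:{tag}"
    return None
```

For example, a 12 GB card selects Q8_0 and an 8 GB card selects Q4_K_M; anything below 6.9 GB returns None, meaning the model would need CPU offload.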

Compatible Hardware for Q8_0

Showing compatibility for the recommended quantization (Q8_0, 11 GB VRAM).

Hardware                          VRAM    Type  Fit
Mac Pro M2 Ultra 192GB            192 GB  mac   Runs
Mac Studio M4 Ultra 192GB         192 GB  mac   Runs
Mac Studio M4 Max 128GB           128 GB  mac   Runs
MacBook Pro M4 Max 128GB          128 GB  mac   Runs
Mac Studio M4 Max 64GB            64 GB   mac   Runs
MacBook Pro M4 Max 64GB           64 GB   mac   Runs
Mac mini M4 Pro 48GB              48 GB   mac   Runs
MacBook Pro M4 Max 48GB           48 GB   mac   Runs
MacBook Pro M4 Pro 48GB           48 GB   mac   Runs
NVIDIA GeForce RTX 5090           32 GB   gpu   Runs
Mac mini M4 32GB                  32 GB   mac   Runs
AMD Radeon RX 7900 XTX            24 GB   gpu   Runs
NVIDIA GeForce RTX 3090           24 GB   gpu   Runs
NVIDIA GeForce RTX 4090           24 GB   gpu   Runs
Mac mini M4 Pro 24GB              24 GB   mac   Runs
MacBook Air M4 24GB               24 GB   mac   Runs
MacBook Pro M4 Pro 24GB           24 GB   mac   Runs
AMD Radeon RX 7900 XT             20 GB   gpu   Runs
AMD Radeon RX 7800 XT             16 GB   gpu   Runs
Intel Arc A770                    16 GB   gpu   Runs
NVIDIA GeForce RTX 4060 Ti 16GB   16 GB   gpu   Runs
NVIDIA GeForce RTX 4070 Ti Super  16 GB   gpu   Runs
NVIDIA GeForce RTX 4080           16 GB   gpu   Runs
NVIDIA GeForce RTX 5080           16 GB   gpu   Runs
Mac mini M4 16GB                  16 GB   mac   Runs
MacBook Air M3 16GB               16 GB   mac   Runs
MacBook Air M4 16GB               16 GB   mac   Runs
NVIDIA GeForce RTX 3060 12GB      12 GB   gpu   Runs (tight)
NVIDIA GeForce RTX 3080 12GB      12 GB   gpu   Runs (tight)
NVIDIA GeForce RTX 4070 Ti        12 GB   gpu   Runs (tight)
NVIDIA GeForce RTX 4070           12 GB   gpu   Runs (tight)
AMD Radeon RX 7600                8 GB    gpu   CPU Offload
Intel Arc A750                    8 GB    gpu   CPU Offload
NVIDIA GeForce RTX 3070           8 GB    gpu   CPU Offload
NVIDIA GeForce RTX 4060 Ti 8GB    8 GB    gpu   CPU Offload
NVIDIA GeForce RTX 4060           8 GB    gpu   CPU Offload
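
The Fit column above is consistent with a simple threshold rule against the 11 GB Q8_0 requirement. The sketch below reproduces the three labels; the 25% headroom margin that separates "Runs" from "Runs (tight)" is an assumption inferred from the rows shown (12 GB is tight, 16 GB is not), not a documented rule, and classify_fit is a hypothetical name.

```python
REQUIRED_GB = 11.0   # VRAM required for Q8_0, as stated on this page
TIGHT_MARGIN = 1.25  # assumption: under 25% headroom counts as "tight"


def classify_fit(vram_gb: float, required_gb: float = REQUIRED_GB) -> str:
    """Map available memory to one of the Fit labels used in the table."""
    if vram_gb < required_gb:
        # Model does not fit entirely in GPU memory.
        return "CPU Offload"
    if vram_gb < required_gb * TIGHT_MARGIN:
        # Fits, but with little room left for context and other processes.
        return "Runs (tight)"
    return "Runs"
```

Under these assumptions, 8 GB cards fall into CPU Offload, 12 GB cards into Runs (tight), and 16 GB and above into Runs, matching the table.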

Benchmark Scores

MMLU 71.3