Model Families

19 model families, each with variants at one or more parameter sizes.

Aya Expanse

Cohere

Cohere's Aya Expanse is a multilingual model family optimized for 23 languages. Available in 8B and 32B sizes, these mod...

2 variants 8B — 32B

Cogito

Deep Cogito

Deep Cogito's hybrid reasoning models can dynamically switch between fast direct responses and deep chain-of-though...

3 variants 8B — 70B

Command R

Cohere

Cohere's Command R is a family of models optimized for retrieval-augmented generation (RAG) and enterprise use cases. Co...

3 variants 35B — 111B

DeepSeek R1

DeepSeek

DeepSeek's R1 family of reasoning-focused open-weight models, trained with reinforcement learning to excel at complex mu...

7 variants 1.5B — 671B

DeepSeek V3

DeepSeek

DeepSeek's V3 series of mixture-of-experts models with 671B total parameters and 37B active per token. Among the most ca...

3 variants 671B

Falcon 3

TII

The third generation of TII's Falcon models, offering efficient 7B and 10B parameter variants. Designed for strong gener...

2 variants 7B — 10B

Gemma 2

Google

Google's Gemma 2 is a family of lightweight, open-weight models built from the same research and technology used to crea...

3 variants 2B — 27B

Gemma 3

Google

Google's Gemma 3 is a major upgrade over Gemma 2, featuring native multimodal support (text + image input) starting at 4...

6 variants 1B — 27B

Gemma 4

Google

Google's Gemma 4 is the most capable open model family from Google DeepMind, released April 2026 under Apache 2.0. It sp...

4 variants 2B — 31B

GLM

Zhipu AI

Zhipu AI's GLM family includes the GLM-5 flagship reasoning model — a 744B parameter MoE with 40B active parameters per ...

1 variant 744B

Llama 3

Meta

Meta's Llama 3 is one of the most capable and widely adopted open-weight model families. Spanning from compact 1B parame...

8 variants 1B — 405B

Llama 4

Meta

Meta's Llama 4 introduces mixture-of-experts architecture and native multimodal support to the Llama family. Scout (109B...

2 variants 109B — 400B

Mistral

Mistral AI

Mistral AI's open-weight model family, known for exceptional efficiency and strong performance relative to model size. I...

10 variants 7B — 141B

Nemotron 3

NVIDIA

NVIDIA's Nemotron 3 family features novel hybrid architectures combining Mamba and Transformer blocks. Optimized for inf...

1 variant 8B

Phi

Microsoft

Microsoft's Phi family of small language models, designed to demonstrate that carefully curated training data can enable...

4 variants 3.8B — 14B

Qwen 2.5

Alibaba

Alibaba's Qwen 2.5 is a comprehensive family of open-weight models spanning from 7B to 72B parameters, with specialized ...

9 variants 7B — 72B

Qwen 3

Alibaba

Alibaba's Qwen 3 is the next generation of the Qwen family, featuring both dense models (0.6B to 32B) and mixture-of-exp...

8 variants 0.6B — 235B

Qwen 3.5

Alibaba

Alibaba's Qwen 3.5 is a multimodal model family spanning 0.8B to 397B parameters, supporting 201 languages with 256K con...

7 variants 0.8B — 122B

StarCoder

BigCode

BigCode's StarCoder is a family of code-specialized language models developed as part of an open scientific collaboratio...

1 variant 15B