DeepSeek R1 8B

MIT

DeepSeek · 8B · transformer-decoder

2025-01-20 · 131K context · 8B params

Use Cases

chat · code · reasoning · math

Quantization Options

Quant                  Bits  VRAM     Quality    Status
Q4_K_M (recommended)   4     7.5 GB   Good       -
Q8_0                   8     11.5 GB  Good       -
F16                    16    20.0 GB  Excellent  -
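
The figures listed above can be turned into a small selection helper. The following is a minimal sketch, not part of the model card: the names `QUANTS`, `nominal_weight_gb`, and `pick_quant` are hypothetical, and the weights-only estimate (parameters × bits / 8) is a rough approximation that treats the listed bit width as exact. The gap between that estimate and the listed VRAM is presumably runtime overhead such as the KV cache and activations, which is an assumption rather than something the card states.

```python
# VRAM figures copied from the Quantization Options listing (GB).
# Tuple layout: (name, nominal bits per weight, listed VRAM in GB, listed quality)
QUANTS = [
    ("Q4_K_M", 4, 7.5, "Good"),
    ("Q8_0", 8, 11.5, "Good"),
    ("F16", 16, 20.0, "Excellent"),
]

PARAMS_BILLIONS = 8  # parameter count from the model card


def nominal_weight_gb(bits, params_billions=PARAMS_BILLIONS):
    """Weights-only size in GB (1e9 bytes): parameters * bits / 8.

    Rough approximation: real quantization formats carry some extra
    per-block metadata, so actual files are somewhat larger.
    """
    return params_billions * bits / 8


def pick_quant(available_vram_gb, headroom_gb=0.0):
    """Return the highest-precision listed quant that fits, or None.

    headroom_gb is a hypothetical safety margin reserved for other
    processes using the same GPU.
    """
    budget = available_vram_gb - headroom_gb
    fitting = [q for q in QUANTS if q[2] <= budget]
    return max(fitting, key=lambda q: q[1]) if fitting else None


if __name__ == "__main__":
    for name, bits, vram, quality in QUANTS:
        weights = nominal_weight_gb(bits)
        print(f"{name}: ~{weights:.1f} GB weights, "
              f"~{vram - weights:.1f} GB implied overhead ({quality})")
    for card_gb in (8, 12, 24):
        choice = pick_quant(card_gb)
        print(f"{card_gb} GB card ->", choice[0] if choice else "nothing fits")
```

With the listed numbers, an 8 GB card only fits Q4_K_M and a 12 GB card fits Q8_0. Passing a nonzero `headroom_gb` is a conservative choice when the GPU also drives a display.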

About this model

DeepSeek R1 8B is a Llama 3.1-based distillation of the full DeepSeek R1 reasoning model. It brings strong chain-of-thought reasoning to an 8B-parameter model, so it runs on consumer GPUs with 8-12 GB of VRAM at Q4_K_M or Q8_0 quantization. Compared to the Qwen-based 7B distill, this Llama-based variant often shows better English performance. It is a solid choice for users who want reasoning capabilities without the VRAM requirements of the 14B or larger variants.

Benchmarks

MMLU: 70.0