Model Overview
DeepSeek-R1 is an open-source reasoning model built on DeepSeek-V3-Base and trained with large-scale reinforcement learning. Its precursor, DeepSeek-R1-Zero, was trained with reinforcement learning alone, without supervised fine-tuning as a preliminary step; DeepSeek-R1 incorporates cold-start data before reinforcement learning. It achieves performance comparable to OpenAI o1 on math, code, and reasoning tasks. Chain-of-thought behaviors such as self-verification and reflection emerged naturally during training. DeepSeek also released six distilled dense models (1.5B, 7B, 8B, 14B, 32B, 70B) based on the Qwen2.5 and Llama 3 series. The latest version, DeepSeek-R1-0528, significantly improves reasoning and inference capabilities, approaching o3 and Gemini 2.5 Pro on several benchmarks.