DeepSeek R1 LLM Outshines Rival Reasoning Models

www.analyticsdrift.com

Image source: Analytics Drift

On January 20, 2025, Chinese AI company DeepSeek released DeepSeek R1, a large language model designed to mimic human reasoning and deliver advanced analytical capabilities.

Image source: DeepSeek

Chinese AI Lab DeepSeek Unveils DeepSeek R1

Compared with rival models such as OpenAI's o1-mini, GPT-4o, and Anthropic's Claude 3.5 Sonnet, DeepSeek R1 has demonstrated superior performance in chemistry, coding, and mathematics.

Image source: Canva

This LLM Stuns the Global Tech Community

The model ranks in the 96.3rd percentile against human participants on competitive coding, alongside exceptional Pass@1 scores on the MATH-500 and GPQA Diamond benchmarks.

Image source: DeepSeek

DeepSeek R1’s Impressive Results
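Pass@1 simply measures the fraction of problems a model solves with its first sampled answer. A minimal sketch of the metric (the outcomes list below is hypothetical, not DeepSeek's actual evaluation data):

```python
def pass_at_1(first_attempt_correct):
    """Fraction of problems whose first sampled answer is correct."""
    return sum(first_attempt_correct) / len(first_attempt_correct)

# Hypothetical first-attempt outcomes for five problems (True = solved).
outcomes = [True, True, False, True, True]
print(pass_at_1(outcomes))  # 0.8
```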

This state-of-the-art model uses a Mixture-of-Experts (MoE) architecture, like the company's earlier model, DeepSeek-V3, an open-source language model with 671 billion parameters.

Image source: DeepSeek

The Architecture Behind This LLM
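In an MoE layer, a router sends each input to only a few expert sub-networks, so most of the model's hundreds of billions of parameters sit idle on any single forward pass. A minimal top-k routing sketch (toy experts and router scores for illustration only, not DeepSeek's implementation):

```python
import heapq

def moe_layer(x, experts, router_scores, top_k=2):
    """Route input x to the top_k highest-scoring experts and combine
    their outputs, weighted by the normalized router scores."""
    chosen = heapq.nlargest(top_k, enumerate(router_scores), key=lambda p: p[1])
    total = sum(score for _, score in chosen)
    return sum(experts[i](x) * (score / total) for i, score in chosen)

# Four toy "experts"; a real MoE layer holds many full sub-networks.
experts = [lambda x, k=k: x * k for k in (1, 2, 3, 4)]
scores = [0.1, 0.5, 0.1, 0.3]  # router's affinity for each expert
print(moe_layer(10.0, experts, scores))  # only experts 1 and 3 run
```

Because only `top_k` experts execute per input, compute cost scales with the active experts rather than the full parameter count.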

DeepSeek R1 comes in two versions: DeepSeek-R1-Zero and DeepSeek-R1. The former was trained purely via reinforcement learning, without supervised fine-tuning.

Image source: DeepSeek

DeepSeek R1 Versions
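Training with reinforcement learning alone means the model learns from reward signals rather than labeled demonstrations; DeepSeek describes simple rule-based rewards that check answer correctness and output format. A toy sketch of that idea (the tags and weights below are illustrative assumptions, not the actual reward function):

```python
import re

def reward(response, gold_answer):
    """Toy rule-based reward: +1.0 if the tagged final answer matches
    the reference, +0.2 if reasoning appears inside <think> tags."""
    r = 0.0
    m = re.search(r"<answer>(.*?)</answer>", response, re.S)
    if m and m.group(1).strip() == gold_answer:
        r += 1.0  # accuracy reward
    if re.search(r"<think>.+?</think>", response, re.S):
        r += 0.2  # format reward (hypothetical weight)
    return r

resp = "<think>2 + 2 equals 4</think><answer>4</answer>"
print(reward(resp, "4"))  # 1.2
```

Because such rewards are computed by rules rather than a learned judge, no human-labeled fine-tuning data is required.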

Although DeepSeek R1 is not fully open-source, its API token pricing is significantly lower than OpenAI's, making the model a more affordable option for researchers and developers.

Image source: DeepSeek

DeepSeek R1’s Economical Token Pricing
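API usage is typically billed per million input and output tokens, so comparing providers is simple arithmetic. A quick cost helper (the prices below are placeholders for illustration; check each provider's current price list):

```python
def request_cost(input_tokens, output_tokens, price_in, price_out):
    """Cost in USD of one request, given per-million-token prices."""
    return input_tokens / 1e6 * price_in + output_tokens / 1e6 * price_out

# Placeholder per-million-token prices, not actual published rates.
budget = request_cost(50_000, 20_000, price_in=0.55, price_out=2.19)
premium = request_cost(50_000, 20_000, price_in=15.0, price_out=60.0)
print(f"budget ${budget:.4f} vs premium ${premium:.4f}")
```

Even modest per-token price differences compound quickly at research scale, which is why cheaper reasoning-model APIs matter to developers.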