DeepSeek released their first full-fledged reasoning model, R1, under the MIT license. R1 ...

2025-01-22 02:47:09

DeepSeek released their first full-fledged reasoning model, R1, under the MIT license.

R1 is trained largely with reinforcement learning, which does not rely on human-written worked solutions the way supervised fine-tuning does. The model learns to solve problems independently by exploring various approaches and generating multiple candidate answers, which are scored and used to improve the model over successive iterations.
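To make the "generate several answers, score them, iterate" loop concrete, here is a toy sketch in Python. It is not DeepSeek's training code: the "policy" is just a table of preference weights over a fixed, hypothetical list of candidate answers, and the names `reward`, `group_advantages`, and `train` are made up for illustration. Scoring each sample relative to the others drawn for the same prompt is in the spirit of the group-relative approach (GRPO) described in the R1 report, heavily simplified.

```python
import random
import statistics


def reward(answer: int, target: int) -> float:
    """Rule-based score: 1.0 for a correct final answer, otherwise 0.0."""
    return 1.0 if answer == target else 0.0


def group_advantages(rewards: list[float]) -> list[float]:
    """Score each sample relative to the other samples in its group."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    if std == 0.0:
        # Every sample scored the same, so there is nothing to learn from.
        return [0.0] * len(rewards)
    return [(r - mean) / std for r in rewards]


def train(target: int, candidates: list[int], steps: int = 200,
          group_size: int = 8, lr: float = 0.5, seed: int = 0) -> dict[int, float]:
    """Toy loop: sample a group of answers, score them, nudge the policy."""
    rng = random.Random(seed)
    # Toy "policy": unnormalised preference weight per candidate answer.
    weights = {c: 1.0 for c in candidates}
    for _ in range(steps):
        samples = rng.choices(
            candidates, weights=[weights[c] for c in candidates], k=group_size
        )
        rewards = [reward(s, target) for s in samples]
        for s, adv in zip(samples, group_advantages(rewards)):
            # Raise weights of above-average answers, lower the rest.
            weights[s] = max(1e-3, weights[s] + lr * adv / group_size)
    total = sum(weights.values())
    return {c: w / total for c, w in weights.items()}


if __name__ == "__main__":
    # Hypothetical prompt "2 + 2 = ?" with a few candidate answers.
    policy = train(target=4, candidates=[3, 4, 5, 22])
    print(policy)
```

No human-written solution appears anywhere in the loop: the only feedback is whether the final answer passes an automatic check, and the policy drifts toward answers that beat the group average.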

https://www.interconnects.ai/p/deepseek-r1-recipe-for-o1

#technology

Author Public Key

npub1k8wtv8evmj84ervjyf4qwnp2yw7j64egldx57fm3gtlfzx0s8yks6vcv65


Yogthos on Nostr