Yogthos on Nostr: DeepSeek released their first full-fledged reasoning model, R1 under MIT license. R1 ...
DeepSeek released their first full-fledged reasoning model, R1, under the MIT license.
R1 is trained using direct reinforcement learning, an approach that, unlike supervised fine-tuning, doesn't rely on explicit labels or human-provided solutions. The model learns to solve problems independently by exploring various approaches and generating multiple candidate answers, which are scored and used to steer the next round of training.
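To make the "generate several answers, score them, iterate" loop concrete, here is a toy sketch in Python. It is not DeepSeek's training code: the "policy" is just a preference weight per hypothetical strategy, the reward is a simple correctness check, and the within-group normalization of rewards is an assumption chosen for illustration. All names (`score`, `group_advantages`, `train`, `STRATEGIES`) are made up for this example.

```python
import math
import random
import statistics

# Hypothetical candidate "approaches" to the problem 17 + 25.
STRATEGIES = [
    lambda: 17 + 25,  # correct approach
    lambda: 17 * 25,  # wrong approach
    lambda: 25 - 17,  # wrong approach
]


def score(answer: int, target: int) -> float:
    """Automatic reward: 1.0 for a correct final answer, else 0.0."""
    return 1.0 if answer == target else 0.0


def group_advantages(rewards: list[float]) -> list[float]:
    """Center and scale rewards within one group of sampled answers."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    if std == 0.0:
        # Every answer scored the same, so there is nothing to learn from.
        return [0.0 for _ in rewards]
    return [(r - mean) / std for r in rewards]


def train(strategies, target, steps=200, group_size=8, lr=0.5, seed=0):
    """Sample a group of answers per step and shift preferences by advantage."""
    rng = random.Random(seed)
    prefs = [0.0] * len(strategies)  # toy policy: one preference per strategy
    for _ in range(steps):
        weights = [math.exp(p) for p in prefs]
        picks = rng.choices(range(len(strategies)), weights=weights, k=group_size)
        rewards = [score(strategies[i](), target) for i in picks]
        for i, adv in zip(picks, group_advantages(rewards)):
            # Above-average answers are reinforced, below-average ones suppressed.
            prefs[i] += lr * adv / group_size
    return prefs


if __name__ == "__main__":
    print(train(STRATEGIES, target=42))
```

No human-written solution appears anywhere in the loop; the only feedback is the score each sampled answer receives relative to the others in its group.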
https://www.interconnects.ai/p/deepseek-r1-recipe-for-o1
#technology