Monday, January 27, 2025

More VentureBeat On DeepSeek: "DeepSeek R1’s bold bet on reinforcement learning: How it outpaced OpenAI at 3% of the cost"

As the twitterverse went hair-on-fire crazy this weekend, VentureBeat had the best reporting, with January 25's "Why everyone in AI is freaking out about DeepSeek"

Here's more. First, the headliner, also from January 25: 

DeepSeek R1’s Monday release has sent shockwaves through the AI community, disrupting assumptions about what’s required to achieve cutting-edge AI performance. Matching OpenAI’s o1 at just 3%-5% of the cost, this open-source model has not only captivated developers but also challenged enterprises to rethink their AI strategies.

The model has rocketed to the top of HuggingFace’s trending charts – downloaded 109,000 times as of this writing – as developers rush to try it out and seek to understand what it means for their AI development. Users are commenting that DeepSeek’s accompanying search feature (available on DeepSeek’s site) is now superior to competitors like OpenAI and Perplexity, and is rivaled only by Google’s Gemini Deep Research.

The implications for enterprise AI strategies are profound: With reduced costs and open access, enterprises now have an alternative to costly proprietary models like OpenAI’s. DeepSeek’s release could democratize access to cutting-edge AI capabilities, enabling smaller organizations to compete effectively in the AI arms race.

This story focuses on exactly how DeepSeek managed this feat and what it means for the vast number of users of AI models. For enterprises developing AI-driven solutions, DeepSeek’s breakthrough challenges assumptions of OpenAI’s dominance and offers a blueprint for cost-efficient innovation. It’s the “how” of what DeepSeek did that should be the most educational here.

DeepSeek’s breakthrough: Moving to pure reinforcement learning
In November, DeepSeek made headlines with its announcement that it had achieved performance surpassing OpenAI’s o1, but at the time it only offered a limited R1-lite-preview model. With Monday’s full release of R1 and the accompanying technical paper, the company revealed a surprising innovation: a deliberate departure from the conventional supervised fine-tuning (SFT) process widely used in training large language models (LLMs).

SFT, a standard step in AI development, involves training models on curated datasets to teach step-by-step reasoning, often referred to as chain-of-thought (CoT). It is considered essential for improving reasoning capabilities. However, DeepSeek challenged this assumption by skipping SFT entirely, opting instead to rely on reinforcement learning (RL) to train the model.
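
To make the distinction concrete, here is a minimal sketch of what an SFT step on chain-of-thought data can look like, using PyTorch and the Hugging Face transformers library. The base model ("gpt2"), the toy example, and the hyperparameters are illustrative stand-ins, not DeepSeek's actual setup:

```python
# A minimal sketch of supervised fine-tuning (SFT) on chain-of-thought data.
# "gpt2" and the toy example are placeholders for a real base model and a
# real curated dataset.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")   # placeholder base model
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Each curated example spells out the reasoning (chain-of-thought) before
# the final answer, so the model learns to imitate that step-by-step style.
examples = [
    "Q: What is 12 * 7?\nReasoning: 12 * 7 = 10*7 + 2*7 = 70 + 14 = 84.\nA: 84",
]

optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
model.train()
for text in examples:
    batch = tokenizer(text, return_tensors="pt")
    # Standard next-token objective: the labels are the inputs themselves.
    loss = model(**batch, labels=batch["input_ids"]).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```

The point of SFT is visible in the data itself: the model is trained to imitate reasoning that humans (or a stronger model) wrote out for it.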

This bold move forced DeepSeek-R1 to develop independent reasoning abilities, avoiding the brittleness often introduced by prescriptive datasets. While some flaws emerged – leading the team to reintroduce a limited amount of SFT during the final stages of building the model – the results confirmed the fundamental breakthrough: reinforcement learning alone could drive substantial performance gains.
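
For contrast, here is a bare-bones sketch of the RL alternative in the same stack: sample a completion, score it with a rule-based reward (did the model reach the verifiable answer?), and reinforce accordingly. This is a simplified REINFORCE-style update; DeepSeek's technical paper describes a more elaborate group-relative scheme (GRPO), and every name below is a stand-in:

```python
# A bare-bones REINFORCE-style sketch of RL on a verifiable task: sample a
# completion, score it with a rule-based reward, and reinforce accordingly.
# This is a simplification, not DeepSeek's actual method (GRPO).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")   # placeholder base model
model = AutoModelForCausalLM.from_pretrained("gpt2")
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

prompt, gold = "Q: What is 12 * 7?\nA:", "84"
inputs = tokenizer(prompt, return_tensors="pt")
prompt_len = inputs["input_ids"].shape[1]

# Sample a completion from the current policy (generate keeps no gradients).
sample = model.generate(**inputs, max_new_tokens=16, do_sample=True,
                        pad_token_id=tokenizer.eos_token_id)
completion = tokenizer.decode(sample[0, prompt_len:])

# Rule-based reward: +1 if the verifiable answer appears, else -1.
reward = 1.0 if gold in completion else -1.0

# Recompute log-probs of the sampled tokens with gradients enabled.
logits = model(sample).logits[:, :-1]
logps = torch.log_softmax(logits, dim=-1)
token_logps = logps.gather(-1, sample[:, 1:].unsqueeze(-1)).squeeze(-1)
completion_logp = token_logps[0, prompt_len - 1:].sum()

loss = -reward * completion_logp   # push up rewarded completions
loss.backward()
optimizer.step()
```

The contrast with the SFT sketch above is the whole story: no human-written reasoning is imitated. The model gets only a scalar signal about whether its own output reached the right answer, and must discover the intermediate steps itself.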

The company got much of the way there using open source – a conventional and unsurprising path
First, some background on how DeepSeek got to where it did. DeepSeek, a 2023 spin-off from the Chinese hedge fund High-Flyer Quant, began by developing AI models for its proprietary chatbot before releasing them for public use. Little is known about the company’s exact approach, but it quickly open-sourced its models, and it’s extremely likely that the company built upon open projects produced by Meta, for example the Llama model and the ML library PyTorch....

....MUCH MORE 

And January 24:

Tech leaders respond to the rapid rise of DeepSeek

....Yann LeCun, the Chief AI Scientist for Meta’s Fundamental AI Research (FAIR) division, posted on his LinkedIn account:

“To people who see the performance of DeepSeek and think:
‘China is surpassing the US in AI.’
You are reading this wrong.
The correct reading is:
‘Open source models are surpassing proprietary ones.’

DeepSeek has profited from open research and open source (e.g. PyTorch and Llama from Meta)
They came up with new ideas and built them on top of other people’s work.
Because their work is published and open source, everyone can profit from it.
That is the power of open research and open source.”
....

Previously on the LeCun channel:

November 2024 - Chief AI Scientist at Meta, Yann LeCun: "I don't wanna say 'I told you so', but I told you so."

February 2024 - "Meta’s A.I. Chief Yann LeCun Explains Why a House Cat Is Smarter Than The Best A.I." 

And many more.

And most recently on DeepSeek: