Tuesday, January 28, 2025

"It is 'categorically false that China duplicated OpenAI for $5 million,' a veteran Wall Street tech analyst said Monday."

From Barron's, January 27, 4:29 PM EST:

DeepSeek Sparked a Market Panic. Here Are the Facts.

Social media never lets facts get in the way of a good story.

Over the weekend, viral posts suggested that Chinese company DeepSeek had recreated OpenAI’s artificial intelligence prowess for just $6 million, versus the billions spent by U.S. tech giants. The runaway hype quickly raised questions about America’s AI leadership and tanked tech stocks on Monday. The Nasdaq Composite finished the day down 3.1%, while AI leader Nvidia tumbled 17%.

But the reality is far more complicated. DeepSeek didn’t simply replicate OpenAI’s ability by spending a few million dollars.

DeepSeek first unveiled the $6 million figure in a late-December technical paper for its DeepSeek-V3 model. The start-up estimated that the model’s final training run, which consumed 2.8 million GPU hours, would cost $5.6 million if that capacity were rented from a cloud provider. Importantly, DeepSeek excluded costs related to “prior research and ablation experiments on architectures, algorithms, or data.”

This means the number omits all prior R&D spending: developing the model’s architecture and algorithms, acquiring data, paying salaries, buying GPUs, and running test experiments. Comparing a theoretical final-training-run cost with overall U.S. company spending on AI infrastructure capital expenditures is comparing apples and oranges. DeepSeek’s true overall cost is likely much higher.
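For context, the arithmetic behind the headline number is a single multiplication. Here is a back-of-envelope sketch in Python; the $2-per-GPU-hour rental rate is an assumption, implied by the quoted figures rather than stated in the article:

```python
# Back-of-envelope reconstruction of the quoted $5.6 million figure.
# ASSUMPTION: a $2.00/GPU-hour cloud rental rate, implied by
# $5.6M / 2.8M GPU-hours; this article does not state the rate itself.

gpu_hours = 2.8e6           # final training run, per DeepSeek's technical paper
rate_per_gpu_hour = 2.00    # assumed rental rate, USD per GPU-hour

final_run_cost = gpu_hours * rate_per_gpu_hour
print(f"Final training run: ${final_run_cost:,.0f}")  # -> Final training run: $5,600,000

# Everything else -- prior research, ablations, data acquisition, salaries,
# purchased GPUs -- sits outside this number, per the paper's own caveat.
```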

On Monday, Bernstein analyst Stacy Rasgon cited DeepSeek’s disclosure, noting a “fundamental misunderstanding” over the $5 million figure. It is “categorically false that China duplicated OpenAI for $5 million,” he wrote.

Technology fund manager Gavin Baker called using the $6 million training figure “deeply misleading,” emphasizing that a smart team couldn’t train the DeepSeek model from scratch with a few million dollars.

Several AI experts strongly suspect that DeepSeek used outputs from advanced U.S. models, in addition to its own, to optimize its models through a process called distillation, in which a smaller model’s capability is improved by training it on a larger model’s outputs.
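For readers unfamiliar with the term, here is a minimal sketch of the classic logit-based form of distillation, written in PyTorch with made-up tensors. It illustrates the general technique only, not DeepSeek’s actual training code; distillation from a closed model accessed over an API would typically use the larger model’s generated text as training data rather than its logits.

```python
# A minimal sketch of knowledge distillation: a small "student" model
# learns to match the softened output distribution of a large "teacher".
# Hypothetical models and data; illustrative only.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Soft-target loss: push the student's output distribution toward
    the teacher's, softened by a temperature."""
    soft_targets = F.softmax(teacher_logits / temperature, dim=-1)
    log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    # KL divergence between teacher and student distributions;
    # the T^2 factor keeps gradient scale comparable to a hard-label loss.
    return F.kl_div(log_probs, soft_targets, reduction="batchmean") * temperature**2

# Usage: per batch, the teacher labels the data with soft probabilities
# and the student trains against them.
teacher_logits = torch.randn(4, 32_000)  # e.g., a large model's vocabulary logits
student_logits = torch.randn(4, 32_000, requires_grad=True)
loss = distillation_loss(student_logits, teacher_logits)
loss.backward()
```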

Recent news out of China, meanwhile, debunks the idea of AI on the cheap. Last week, China announced plans to provide $137 billion in financial support for AI over the next few years. DeepSeek founder Liang Wenfeng reportedly told Chinese Premier Li Qiang last week that American export restrictions on AI GPUs remained a “bottleneck,” according to The Wall Street Journal.

All of this means global technology companies are likely to keep spending on AI infrastructure to train new advanced models and develop next-generation technology.

Amid the DeepSeek frenzy, Meta Platforms CEO Mark Zuckerberg announced on Friday that his company would invest $60 billion to $65 billion in capital expenditures this year while significantly growing its AI teams. Last October, Meta had guided to 2024 capex of $38 billion to $40 billion. “This will be a defining year for AI,” Zuckerberg wrote Friday on Facebook, adding that Meta is building a 2-gigawatt-plus data center and will have over 1.3 million GPUs by year-end.

To be clear, there are important things to be gleaned from DeepSeek. In the wake of its new models, there are new questions about computing capacity needed for AI inference, the process of generating results from AI models....

....MUCH MORE