As noted in our post following NVDA's February 27 earnings report, "Nvidia Plummets 8.5% On Ennui, Boredom (NVDA)":
I'm starting to think Mr. Huang is going to have to enter the big March conference by levitating, just to get the market's attention.
The stock is up a couple bucks after dropping $4.10 on Tuesday.
A slightly jaundiced view of the GTC goings-on from SemiAnalysis, March 19:
Next Generation Nvidia Systems, Ground Up Inference Optimizations from Silicon to Systems to Software, The More You Buy The More You Make
The Reasoning Token Explosion
AI model progress has accelerated tremendously, and in the last six months, models have improved more than in the previous six months. This trend will continue because three scaling laws are stacked together and working in tandem: pre-training scaling, post-training scaling, and inference-time scaling. GTC this year is all about addressing the new scaling paradigms.
Source: Nvidia
Claude 3.7 shows incredible performance for software engineering. DeepSeek V3 shows that the cost of last-generation model capabilities is plummeting, driving further adoption. OpenAI’s o1 and o3 models showed that longer inference time and search yield much better answers. As in the early days of pre-training scaling laws, there is no limit in sight for adding more compute to post-training these models. This year’s GTC is focused on enabling the explosion in intelligence and tokens. Nvidia is focusing on massive 35x improvements in inference cost to enable the training and deployment of models.
Last year’s mantra was “the more you buy, the more you save,” but this year’s slogan is “the more you save, the more you buy.” The inference efficiencies delivered in Nvidia’s hardware and software roadmaps unlock the cost-effective deployment of reasoning models, agents, and other transformational enterprise applications, allowing widespread proliferation and deployment: a classic example of Jevons’ paradox at work. Or, as Jensen puts it: “the more you buy, the more you make.”
The market is worried about this. The concern is that DeepSeek-style software optimization and compounding Nvidia-driven hardware improvements are generating so much savings that demand for AI hardware falls and the market ends up in a token glut. Price does influence demand, but as the price of intelligence decreases, the frontier of intelligence capabilities keeps pushing outward, and demand increases. Today’s capabilities are constrained by inference cost. AI’s actual impact on our lives is still in its infancy. As costs drop, net consumption paradoxically increases.
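The Jevons-paradox argument above can be sketched with toy numbers. The 35x cost reduction is the inference figure quoted earlier; the demand response (100x more usage at the lower price) is a hypothetical assumption chosen for illustration, not a Nvidia or SemiAnalysis figure:

```python
# Toy Jevons-paradox illustration: if a price cut unlocks proportionally
# more demand than it saves, total spend on compute rises rather than falls.
# The 35x cost reduction matches the quoted inference improvement; the
# demand-growth multiple is an assumed, illustrative number.

old_cost_per_m_tokens = 10.00   # assumed $/1M tokens before optimization
cost_reduction = 35             # quoted 35x inference-cost improvement
new_cost_per_m_tokens = old_cost_per_m_tokens / cost_reduction

old_demand = 1.0                # normalized token demand at the old price
demand_growth = 100             # assumed: cheaper intelligence unlocks 100x usage
new_demand = old_demand * demand_growth

old_spend = old_cost_per_m_tokens * old_demand
new_spend = new_cost_per_m_tokens * new_demand

# Despite a 35x price cut, total spend rises ~2.9x under these assumptions.
print(f"Spend changes by {new_spend / old_spend:.1f}x")
```

Under these assumed numbers, the price collapse is more than offset by induced demand, which is the bull case the article is making.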
The concern over token deflation is akin to discussing the fiber bubble’s falling per-packet connectivity costs while ignoring the eventual impact that websites and internet-driven applications would have on our lives, society, and economy. The key difference is that bandwidth demand is constrained, while demand for intelligence grows toward infinity as capabilities improve drastically and costs fall.
Nvidia provides numbers supporting the Jevons’ Paradox case. Models now take >100T tokens, and a reasoning model uses 20x more tokens and 150x more compute.
Test-time compute takes hundreds of thousands of tokens per query, and there are hundreds of millions of queries per month. Post-training scaling, which is where models go to school, takes trillions of tokens per model, with hundreds of thousands of post-trained models. Additionally, agentic AI means that multiple models will work together to solve harder and harder problems.
Jensen Math Changes Every Year
Every year, Jensen drops new math rules on the industry. Jensen Math is famously confusing, and to add to the confusion this year, we now observe a third new Jensen Math rule....
....MUCH MORE
The stock is now up $1.63 at $117.06. Ennui, Boredom.
Much more to come over the next few days.