Wednesday, August 13, 2025

"OpenAI's GPT-5 looks less like AI evolution and more like cost cutting"

From The Register, August 13: 

Gotta pay for all those GPUs somehow

Comment For all the superlative-laden claims, OpenAI's new top model appears to be less of an advancement and more of a way to save compute costs — something that hasn't exactly gone over well with the company's most dedicated users.

As the flag bearer that kicked off the generative AI era, OpenAI is under considerable pressure not only to demonstrate technological advances, but also to justify its massive, multi-billion-dollar funding rounds by showing its business is growing.

To do that, OpenAI can either increase its user base, raise prices, or cut costs. Much of the industry is already aligning around its $20 and $200 a month pricing tiers. So OpenAI would need to offer something others cannot to justify a premium, or risk losing customers to competitors such as Anthropic or Google.

With the academic year about to kick off, OpenAI is sure to pick up a fresh round of subscriptions as students file back into classrooms following the summer break. While more paying customers will mean more revenues, it also means higher compute costs.

Enter the cost-cutting era.

Perhaps the best evidence of cost-cutting is the fact that GPT-5 isn't actually one model. It's a collection of at least two models: a lightweight LLM that can quickly respond to most requests and a heavier-duty one designed to tackle more complex topics. Which model a prompt lands in is determined by a router model, which acts a bit like an intelligent load balancer for the platform as a whole. Image prompts use a completely different model, Image Gen 4o.

This is a departure from how OpenAI has operated in the past. Previously, Plus and Pro users have been able to choose which model they'd like to use. If you wanted to ask o3 mundane questions that GPT-4o could have easily handled, you could.

In theory, OpenAI's router model should allow the bulk of GPT-5's traffic to be served by its smaller, less resource-intensive models.
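The routing idea described above can be sketched in a few lines. This is purely illustrative: the model names, threshold, and keyword-based complexity heuristic below are hypothetical stand-ins, not OpenAI's actual (presumably learned) router.

```python
# Toy sketch of a router sitting in front of a cheap and an expensive model.
# All names and the scoring rule are hypothetical, not OpenAI's real logic.

def complexity_score(prompt: str) -> float:
    """Crude stand-in for a learned router: longer prompts and
    reasoning-flavored keywords push traffic toward the heavier model."""
    keywords = ("prove", "derive", "step by step", "debug", "analyze")
    score = min(len(prompt) / 500, 1.0)
    score += sum(0.2 for kw in keywords if kw in prompt.lower())
    return min(score, 1.0)

def route(prompt: str, threshold: float = 0.5) -> str:
    """Serve the bulk of traffic with the cheap model; escalate only
    when the prompt looks expensive enough to justify it."""
    if complexity_score(prompt) >= threshold:
        return "gpt-5-heavy"   # hypothetical heavyweight model
    return "gpt-5-mini"        # hypothetical lightweight model
```

Under a scheme like this, a mundane question stays on the small model, while a request to "prove step by step" gets escalated, which is exactly how most traffic could end up on the less resource-intensive tier.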

We can see more evidence of cost-cutting in OpenAI's decision to automatically toggle reasoning on and off by default, depending on the complexity of the prompt. Freeloaders... we mean free-tier users... don't have the ability to toggle this on themselves. The less reasoning the models are doing, the fewer tokens they generate and the less expensive they are to operate....

....MUCH MORE 

One of the sidebar stories at El Reg is headlined: "GPT-5 is going so well for OpenAI that there's now a 'show additional models' switch." Ouch.