Monday, November 17, 2025

"Power: The answer to and source of all your AI datacenter problems"

From The Register, November 15: 

Digital Realty CTO Chris Sharp weighs impact of densification on the datacenter and the rise of the AI factory 

Interview In the datacenter biz, power is the product. You either have it or you don't, Chris Sharp tells El Reg.

The CTO of colocation provider Digital Realty explains that without power, there are no servers, no storage, no GPUs, and none of those AI tokens that have Wall Street in a frenzy. But power isn't just the limiting factor in the US and much of the world; it has also upended the way datacenters are designed and built.

Over the past few years, GPU servers have transitioned from air-cooled machines that didn't require much, if any, additional work to deploy in a typical datacenter to something more reminiscent of the bespoke HPC clusters built by the likes of Cray, Eviden, or Lenovo.

This change didn't take place overnight. Nvidia's Ampere generation of GPUs, introduced in 2020, didn't really require a fundamental shift in the company's approach to cooling or thermal management, Sharp told us.

But in the intervening years the winds were changing. GPU servers were not only growing more power-hungry but were now in high demand, and deploying them was no longer as simple as racking and stacking servers.

The chief constraints: power and density. In the past five years we've gone from air-cooled systems that might pull 6 or 7 kilowatts under load to liquid-cooled rack-scale behemoths with more than 120 kW of compute on board.
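
To make the scale of that shift concrete, here is a minimal back-of-the-envelope sketch. The per-rack figures are the ones quoted above; the 10 MW facility budget and the cooling-overhead factor are assumptions made for the illustration, not numbers from the article or from Digital Realty.

```python
# Back-of-the-envelope: how many racks fit in a fixed power envelope?
# The ~6.5 kW and ~120 kW per-rack figures come from the article; the
# 10 MW facility budget and the overhead factor are illustrative
# assumptions, not Digital Realty numbers.

FACILITY_BUDGET_KW = 10_000  # assumed: 10 MW of facility power
OVERHEAD_FACTOR = 1.3        # assumed: allowance for cooling and distribution losses

def racks_supported(rack_kw: float,
                    budget_kw: float = FACILITY_BUDGET_KW,
                    overhead: float = OVERHEAD_FACTOR) -> int:
    """Number of racks the facility can power once overhead is included."""
    return int(budget_kw / (rack_kw * overhead))

for label, rack_kw in [("air-cooled (~6.5 kW/rack)", 6.5),
                       ("liquid-cooled (~120 kW/rack)", 120.0)]:
    n = racks_supported(rack_kw)
    print(f"{label}: ~{n} racks, ~{n * rack_kw / 1000:.1f} MW of IT load")
```

The same envelope that once fed well over a thousand air-cooled racks now feeds a few dozen liquid-cooled ones, which is why power, rather than floor space, has become the product.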

"More times than not, customers are like, 'Okay, I broke through, and I'm free of the supply constraint, I have my chips,' and I have to say: slow down; there's a lot of other things you're going to need," Sharp said.

For a given number of GPUs you now need so many switches, storage servers, power delivery units, and coolant distribution units. In the case of Nvidia's densest systems, existing datacenters may not even be able to support the physical load.
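
The article leaves the actual ratios unspecified, so the sketch below uses purely hypothetical placeholder figures; it is meant only to illustrate the shape of the problem Sharp describes, namely that the supporting gear scales with the GPU count.

```python
import math

# Hypothetical bill-of-materials estimator. Every ratio below is a
# placeholder chosen for illustration -- the article does not give
# these numbers and they are not vendor figures.
HYPOTHETICAL_GEAR_PER_GPU = {
    "network switches": 1 / 16,
    "storage servers": 1 / 32,
    "power distribution units": 1 / 8,
    "coolant distribution units": 1 / 72,  # assumed: roughly one per dense rack
}

def supporting_gear(gpu_count: int) -> dict:
    """Rough count of supporting hardware for a given GPU order."""
    return {item: math.ceil(gpu_count * ratio)
            for item, ratio in HYPOTHETICAL_GEAR_PER_GPU.items()}

print(supporting_gear(1024))
```

The invented ratios aside, the takeaway is Sharp's: the GPUs themselves are only one line item in the deployment.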

"Silicon innovation is going to be hampered by the permanence of concrete in the datacenter," Sharp said. "There's a potential to have bought your infrastructure, and you do not have a place where it goes"

There are a lot of colocation providers that can't handle this level of densification, and even those that can aren't necessarily prepared to support the next generation of compute platforms, he adds.

This is a problem for hardware vendors like Nvidia and AMD, who believe that as Moore's Law slows and advancements in silicon density and energy efficiency become fewer and further between, the best path forward is packing larger and larger chips closer together.

Today, Nvidia's rack systems are hovering around 140 kW in compute capacity, but we've yet to reach a limit. By 2027, Nvidia plans to launch 600 kW racks that pack 576 GPU dies into the space once occupied by just 32....

....MUCH MORE 
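
Taking the excerpt's figures at face value, a quick calculation shows why that roadmap worries facility operators; the only inputs are the rack powers and die counts quoted above.

```python
# Ratios implied by the figures quoted in the excerpt.
current_rack_kw, planned_rack_kw = 140, 600  # today's racks vs. planned 2027 racks
dies_before, dies_after = 32, 576            # GPU dies in the same footprint

print(f"Die density: {dies_after / dies_before:.0f}x")          # 18x
print(f"Rack power:  {planned_rack_kw / current_rack_kw:.1f}x")  # ~4.3x
```

An 18x jump in die density alongside a more than 4x jump in rack power is exactly the densification curve that existing concrete, floor loading, and cooling plant were never designed to follow.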

Reality bytes.