Thursday, November 16, 2023

Britain's Most Powerful Supercomputer And The Butterfly Effect Of Weather Modeling In the Cloud

Great name. 

Isambard Kingdom Brunel may be the greatest engineer of all time and the second-greatest Briton of all time, trailing Winston Churchill. (Æthelred II did not make the list.)

From The Next Platform, November 14:

Will Isambard 4 Be The UK’s First True Exascale Machine?

UPDATED  Here is a story you don’t hear very often: A supercomputing center was just given a blank check up to the peak power consumption of its facility to build a world-class AI/HPC supercomputer instead of a sidecar partition with some GPUs to play around with and wish its researchers had a lot more capacity.

The downside is that somebody is going to lose some parking spots to make it all happen.

Back in May, we talked to Simon McIntosh-Smith, principal investigator for the Isambard project and a professor of HPC at the University of Bristol, about its next generation Isambard 3 system, which is a CPU-only cluster based on the superchip node of a pair of Nvidia’s 72-core “Grace” Arm CPUs, all lashed together with Hewlett Packard Enterprise’s Slingshot 11 interconnect. And at the time, McIntosh-Smith mentioned in passing that around £100 million of the £900 million ($1.12 billion) in funding from the British government to build an exascale supercomputer in the United Kingdom by 2026 was earmarked for short-term, relatively quick deployment of AI infrastructure. So the GW4 collective – that’s the universities of Bath, Bristol, Cardiff, and Exeter – in the United Kingdom started thinking about how they might get some of that £100 million and build a beefier AI partition for Isambard 3.

Isambard 3 already has two partitions. One is based entirely on the Grace-Grace superchip and has 384 nodes across six racks of HPE Cray XD2000 machinery (that’s 64 Grace-Grace superchips per rack) to bring 55,296 Arm cores to bear on HPC workloads. The Grace chip uses the Neoverse “Demeter” V2 cores from Arm Ltd, which are based on the ArmV9 architecture, which we detailed back in August 2022. The other – and smaller – partition in the Isambard 3 system has 32 nodes with Nvidia’s “Hopper” H100 GPU accelerators, and specifically, these nodes use the Grace-Hopper superchips in a one-to-one ratio. All of the nodes have 256 GB of LPDDR5 memory on them, and the GPUs have their own 80 GB of HBM3 memory, of course. By using the Grace CPUs, the GW4 collective tick a few important boxes for a supercomputer in the UK, mainly that it is based on the homegrown Arm architecture and that the nodes are energy efficient and compact.
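Our own back-of-envelope check, not part of the Next Platform piece: the rack and core counts quoted above hang together if each node is one Grace-Grace superchip, meaning two 72-core CPUs. A minimal Python sketch follows; the constant and function names are ours, chosen for illustration.

```python
# Figures quoted in the article for the Isambard 3 CPU-only partition.
NODES = 384               # Grace-Grace superchip nodes
RACKS = 6                 # HPE Cray XD2000 racks
CPUS_PER_SUPERCHIP = 2    # a Grace-Grace superchip pairs two Grace CPUs
CORES_PER_CPU = 72        # cores per Grace CPU


def nodes_per_rack(nodes: int, racks: int) -> int:
    """Superchip nodes in each rack, assuming an even split across racks."""
    return nodes // racks


def total_cores(nodes: int, cpus_per_node: int, cores_per_cpu: int) -> int:
    """Total Arm cores across the partition."""
    return nodes * cpus_per_node * cores_per_cpu


if __name__ == "__main__":
    print(nodes_per_rack(NODES, RACKS))                            # 64
    print(total_cores(NODES, CPUS_PER_SUPERCHIP, CORES_PER_CPU))   # 55296
```

Both results match the article's figures of 64 superchips per rack and 55,296 cores.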

In the early summer, the UK government put out a call to various HPC centers in the country on how they might spend £100 million to build up some AI capability, and in July GW4 submitted its proposal.

“It was a bit later that month,” McIntosh-Smith tells The Next Platform, “that we got a call saying they really liked our proposal. And then they asked: What are your limits? How big could it go? And we were like, ‘Okay, that’s not the sort of question we’re used to.’ They asked if we were limited by space or are we limited by time, or are we limited by power? So we looked into it, and the first limit we would hit was power. We have five megawatts leftover at the site where Isambard 3 is going, and that is just over 5,000 Grace-Hoppers worth of power. And the government basically said, ‘That’s great. You basically fill it up as much as you can and tell us how much that would cost.’”
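Again our own arithmetic rather than anything from the article: five megawatts spread over roughly 5,000 Grace-Hopper superchips implies a budget of about one kilowatt apiece. The sketch below assumes exactly 5,000 units and treats the whole five megawatts as available to them, so it is a rough ceiling. The quote says "just over 5,000", and it does not say how much of the power goes to cooling, networking, or storage.

```python
# Figures taken from McIntosh-Smith's quote; the exact unit count is an assumption.
SITE_POWER_WATTS = 5_000_000   # "five megawatts leftover at the site"
SUPERCHIPS = 5_000             # "just over 5,000 Grace-Hoppers" (rounded down)


def watts_per_superchip(site_power_watts: float, superchips: int) -> float:
    """Average power budget per superchip if the site power is split evenly."""
    return site_power_watts / superchips


if __name__ == "__main__":
    print(watts_per_superchip(SITE_POWER_WATTS, SUPERCHIPS))  # 1000.0
```

With slightly more than 5,000 units, the per-superchip figure drops a little below 1,000 watts.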

As it turns out, that costs £225 million ($281 million), and that also means that the Isambard-AI machine is not some sidecar strapped onto the back-end of Isambard 3, but a supercomputer in its own right and, as it turns out, what will be the most powerful machine in the United Kingdom when it is installed and running next summer.

The Butterfly Effect Of Weather Modeling In the Cloud....

....MUCH MORE

Back in 2021 we happened to catch NVIDIA, at that time attempting to get regulatory approvals to purchase ARM Holdings, announce:

"Nvidia To Invest At Least $100 Million In UK Supercomputer"

That 'puter was the fastest in Britain. From Nvidia:

NVIDIA Launches UK’s Most Powerful Supercomputer, for Research in AI and Healthcare

NVIDIA CEO Unveils ‘First Big Bet’ on Digital Biology Revolution with UK-Based Cambridge-1

First Wave of Startups Harnesses UK’s Most Powerful Supercomputer to Power Digital Biology Breakthroughs

Nvidia didn't get to own ARM outright, having to settle for a fraction, but the Cambridge-1 they built as an enticement is a very interesting, very fast piece of machinery.

And next year it will be just another supercomputer.