Tuesday, June 19, 2018

Sandia National Lab to Install First Petascale Supercomputer Powered by ARM Processors (Masayoshi Son smiles)

SoftBank's chairman paid quite a bit to get his hands on the crown jewel of British high tech.

From Top500:
Sandia National Laboratories will soon be taking delivery of the world’s most powerful supercomputer using ARM processors. The system, known as Astra, is being built by Hewlett Packard Enterprise (HPE) and will deliver 2.3 petaflops of peak performance when it’s installed later this year. 
Astra rendering. Source: HPE

“Sandia National Laboratories has been an active partner in leveraging our Arm-based platform since its early design, and featuring it in the deployment of the world’s largest Arm-based supercomputer, is a historical moment not just for us, but for the industry as we race toward achieving exascale computing,” said Mike Vildibill, vice president, Advanced Technology Group, HPE.

Astra will be based on HPE’s Apollo 70 system and will comprise 2,592 dual-socket nodes, containing about 145,000 cores – by far the largest such system the company has delivered. If it were up and running today, it would easily make it into the upper fifth of the TOP500 list.

Each node will be equipped with two 28-core Cavium ThunderX2 processors running at 2.0 GHz. These aren’t the biggest or the fastest of Cavium’s newest ARM processors, but they represent something of a sweet spot in price-performance. In aggregate, the compute nodes will draw 1.2 MW of power, which translates into a respectable energy efficiency for a 2.3-petaflop machine.
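For anyone who wants to check the arithmetic behind those figures, here is a back-of-the-envelope sketch in Python. The node count, core count, clock speed, power draw and quoted peak all come from the article. The one number that does not is the 8 double-precision flops per core per cycle, which is an assumption about the ThunderX2 (two 128-bit SIMD units doing fused multiply-adds), so treat the derived peak as an estimate rather than an HPE figure. The function names are made up for illustration.

```python
# Figures quoted in the Top500 article.
NODES = 2592
SOCKETS_PER_NODE = 2
CORES_PER_SOCKET = 28
CLOCK_HZ = 2.0e9             # 2.0 GHz
COMPUTE_POWER_WATTS = 1.2e6  # 1.2 MW, compute nodes only
QUOTED_PEAK_FLOPS = 2.3e15   # 2.3 petaflops peak

# ASSUMPTION (not stated in the article): each core retires
# 8 double-precision flops per cycle.
ASSUMED_FLOPS_PER_CYCLE = 8


def total_cores() -> int:
    """Exact core count implied by the node configuration."""
    return NODES * SOCKETS_PER_NODE * CORES_PER_SOCKET


def estimated_peak_flops() -> float:
    """Theoretical peak under the flops-per-cycle assumption above."""
    return total_cores() * CLOCK_HZ * ASSUMED_FLOPS_PER_CYCLE


def gflops_per_watt() -> float:
    """Peak efficiency from the quoted peak and compute-node power."""
    return QUOTED_PEAK_FLOPS / COMPUTE_POWER_WATTS / 1e9


if __name__ == "__main__":
    print(f"cores:           {total_cores():,}")
    print(f"estimated peak:  {estimated_peak_flops() / 1e15:.2f} petaflops")
    print(f"peak efficiency: {gflops_per_watt():.2f} gigaflops per watt")
```

The core count works out to 145,152, which matches the "145,000 cores" in the article. Note that the efficiency figure uses peak rather than measured performance and counts only compute-node power, so it is an optimistic number and not the kind of result a Green500 run would report.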

Local storage will be supplied by Apollo A4520 enclosures, providing 350 TB in the form of an all-flash Lustre appliance. Because of the relatively small capacity and high performance, it will primarily be used for operations needing extreme I/O bandwidth – things like burst buffering and file checkpointing.

Prior to the Astra announcement, most of the other action with regard to ARM-powered HPC was taking place in the United Kingdom. HPE had previously announced that three UK universities (Edinburgh, Leicester, and Bristol) had ordered Apollo 70 clusters, but each of these systems will be outfitted with just 64 nodes and will top out at a mere 74 teraflops. As far as computational capacity goes, the closest thing to Astra is Isambard, a 10,000-core Cray XC50 supercomputer using these same ThunderX2 processors. It’s set to be deployed at the Great Western 4 (GW4) Alliance, a research consortium of four UK universities (Bristol, Bath, Cardiff and Exeter).

Astra’s delivery is the first production deployment of the Department of Energy’s (DOE) National Nuclear Security Administration’s (NNSA) Vanguard Project. The project’s mission is to ensure a viable HPC ecosystem is established for ARM technology within the NNSA and the larger DOE community. Besides Sandia, a number of other national labs are involved in the project, including Lawrence Livermore, Oak Ridge, Argonne, and Los Alamos....

Keeping in mind that the new Summit supercomputer at Oak Ridge turns over at 200 AI-tuned petaflops, this latest 'puter is more about offering a different supercomputer architecture, what with the tech Cavium licenses from ARM.

"IBM builds world’s most powerful supercomputer to crack AI" (IBM; NVDA)