From The Next Platform, April 10:
With MTIA v2 Chip, Meta Can Do AI Inference, But Not Training
If you control your code base and you have only a handful of applications that run at massive scale – what some have called hyperscale – then you, too, can win the Chip Jackpot like Meta Platforms and a few dozen companies and governments in the world have. If you win that jackpot, you are big enough and rich enough to co-design hardware to work precisely for and efficiently with your specific software, and vice versa.
Meta is one of the giants of the metaverse – hence its name change a few years back from Facebook – and it is also one of the biggest innovators in and users of AI in the world. Both require a tremendous amount of compute. And even though Meta Platforms will have spent somewhere around $15 billion between 2017 and 2024 on its fleet of 662,000 GPU accelerators – that is not the cost of systems, but just the cost of the GPUs – to buttress its AI ambitions, the company knows that AI training and inference costs have to come down radically for it to deploy AI at the larger scale it wants to.
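A quick back-of-the-envelope check on those figures (a sketch using only the article's own estimates, not confirmed pricing):

# Implied average unit price from the article's estimates:
# roughly $15 billion spent on 662,000 accelerators, 2017-2024.
total_spend = 15e9       # dollars (article's estimate; GPUs only, not systems)
gpu_count = 662_000      # accelerators (article's figure)
print(f"~${total_spend / gpu_count:,.0f} per GPU")   # -> ~$22,659 per GPU

That works out to roughly $22,700 per accelerator on average across seven years of purchases – context for why Meta wants a cheaper in-house alternative.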
And so the company began designing its Meta Training and Inference Accelerator in 2020, and after three years of work it launched the MTIA v1 last May, based on a significantly enhanced RISC-V architecture. We did a deep dive on MTIA v1, which was aimed squarely at AI inference workloads, particularly the deep learning recommendation models (DLRMs) that drive the advertising and social networking applications in Facebook, Instagram, and other parts of the Meta stack.
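For readers who have not seen one, here is a minimal sketch of the DLRM pattern: embedding-table lookups for sparse categorical features, a small MLP for dense features, pairwise feature interaction, and a top MLP producing a click-probability-style score. The layer sizes below are illustrative assumptions, not Meta's production configuration (Meta's open-source reference implementation lives at github.com/facebookresearch/dlrm):

import torch
import torch.nn as nn

class TinyDLRM(nn.Module):
    # Toy DLRM sketch: sizes are illustrative, not Meta's production values.
    def __init__(self, num_embeddings=1000, embed_dim=16, num_sparse=3, num_dense=4):
        super().__init__()
        # One embedding table per sparse (categorical) feature.
        self.tables = nn.ModuleList(
            nn.Embedding(num_embeddings, embed_dim) for _ in range(num_sparse)
        )
        # Bottom MLP projects dense features into the same embedding space.
        self.bottom = nn.Sequential(nn.Linear(num_dense, embed_dim), nn.ReLU())
        # Top MLP scores the unique pairwise interactions plus the dense vector.
        num_vectors = num_sparse + 1
        num_pairs = num_vectors * (num_vectors - 1) // 2
        self.top = nn.Sequential(nn.Linear(num_pairs + embed_dim, 1), nn.Sigmoid())

    def forward(self, dense, sparse):
        # dense: (batch, num_dense) floats; sparse: (batch, num_sparse) integer ids
        d = self.bottom(dense)
        vecs = [table(sparse[:, i]) for i, table in enumerate(self.tables)]
        x = torch.stack(vecs + [d], dim=1)        # (batch, num_vectors, embed_dim)
        inter = torch.bmm(x, x.transpose(1, 2))   # all pairwise dot products
        iu = torch.triu_indices(x.size(1), x.size(1), offset=1)
        pairs = inter[:, iu[0], iu[1]]            # keep each unique pair once
        return self.top(torch.cat([pairs, d], dim=1))

model = TinyDLRM()
score = model(torch.randn(2, 4), torch.randint(0, 1000, (2, 3)))  # (2, 1) scores

The embedding lookups are the memory-bound part of this workload, which is why DLRM inference accelerators lean so heavily on memory capacity and bandwidth rather than raw FLOPS.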
With the MTIA v2 chip that was just revealed, Meta has built a much more capable device, setting it further down the road towards independence from expensive and scarce GPU accelerators from Nvidia and AMD, as well as other kinds of accelerators from myriad AI startups. Like MTIA v1, it can be used for inference, but not for AI training.
The blog post announcing the MTIA v2 device was written by Eran Tal, Nicolaas Viljoen, and Joel Coburn. Tal spent eight years at Nvidia in the 2000s and eventually became a senior systems design engineer for desktop GPUs. In 2010, Tal moved to Facebook to be a hardware engineer focusing on server design, did a stint helping Facebook co-design a mobile phone with Taiwanese consumer electronics maker HTC, was in charge of storage server designs for three years a decade ago, and then took over the Open Compute Project’s telecom hardware and software efforts (including OpenRAN). In April last year, Tal became director of hardware systems at Meta. Viljoen is the technical lead director of AI and network systems at Meta and was previously director of software engineering for DPU maker Netronome. Coburn has been a software engineer at Meta Platforms for over three years and was previously in the same position at Google for more than eight years. Suffice it to say, they know hardware and software and the nexus where they come together – or don’t....
....MUCH MORE
Again, here's Meta's blog post:
https://ai.meta.com/blog/next-generation-meta-training-inference-accelerator-AI-MTIA/
And at TweakTown, also April 10:
Meta's next-gen in-house AI chip is made on TSMC's 5nm process, with LPDDR5 RAM, not HBM
Meta's next-gen MTIA AI processor is made on TSMC 5nm, up to 1.35GHz frequency, PCIe Gen5 x8 interface, to fight NVIDIA in the cloud business.
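The LPDDR5-instead-of-HBM choice and the modest host link are worth a back-of-the-envelope look. A minimal sketch, using only the standard PCIe 5.0 signaling numbers (32 GT/s per lane, 128b/130b encoding) rather than anything MTIA-specific:

# Theoretical per-direction bandwidth of the PCIe Gen5 x8 host interface
# TweakTown cites. These are PCIe 5.0 spec numbers, not measured MTIA v2 figures.
lanes = 8
raw_gt_per_s = 32            # PCIe 5.0 raw signaling rate per lane
encoding = 128 / 130         # 128b/130b line-code efficiency
gb_per_s = lanes * raw_gt_per_s * encoding / 8   # bits -> bytes
print(f"~{gb_per_s:.1f} GB/s per direction")     # -> ~31.5 GB/s

That link only has to feed the accelerator; the model weights and embedding tables sit in on-board LPDDR5, which trades HBM's raw bandwidth for cheaper, larger capacity – a trade that suits embedding-heavy inference workloads.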