Friday, September 22, 2023

"SambaNova’s New Chip Means GPTs for Everyone"

From IEEE Spectrum, September 20:

In this AI startup’s new world, 5 trillion parameter models require just eight chips 

Chips and talent are in short supply as companies scramble to jump on the AI bandwagon. Startup SambaNova claims its new processor could help companies have their own large language model (LLM) up and running in just days.

The Palo Alto–based company, which has raised more than US $1 billion in venture funding, won’t be selling the chip directly to companies. Instead, it sells access to its custom-built technology stack, which features proprietary hardware and software designed specifically to run the largest AI models.

That technology stack has now received a major upgrade following the launch of the company’s new SN40L processor. Built using Taiwanese chip giant Taiwan Semiconductor Manufacturing Co.’s 5-nanometer process, each device features 102 billion transistors spread across 1,040 cores that are capable of speeds as high as 638 teraflops. It also has a novel three-tier memory system designed to cope with the huge data flows associated with AI workloads.

“A trillion parameters is actually not a big model if you can run it on eight [chips].”
—Rodrigo Liang, SambaNova

SambaNova claims that a node made up of just eight of these chips is capable of supporting models with as many as 5 trillion parameters, which is almost three times the reported size of OpenAI’s GPT-4 LLM. And that’s with a sequence length—a measure of the length of input a model can handle—as high as 256,000 tokens. Doing the same using industry-standard GPUs would require hundreds of chips, claims CEO Rodrigo Liang, putting the total cost of ownership at less than 1/25 of the industry-standard approach....
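Some rough arithmetic shows why that claim hinges on the chip’s three-tier memory system. A minimal sketch, assuming 16-bit weights and counting only raw weight storage (not activations or key-value caches, and not any precision or compression tricks SambaNova may actually use):

```python
def weight_memory_tb(params: float, bytes_per_param: int = 2) -> float:
    """Terabytes needed just to hold a model's weights."""
    return params * bytes_per_param / 1e12

total_tb = weight_memory_tb(5e12)   # 5-trillion-parameter model
per_chip_tb = total_tb / 8          # spread across an eight-chip node

print(total_tb)      # 10.0 TB of weights for the whole model
print(per_chip_tb)   # 1.25 TB per chip
```

Roughly 1.25 TB of weights per chip is far beyond on-chip SRAM or typical HBM capacities, which is presumably why a slower, larger memory tier (such as DDR) enters the picture—at the cost of bandwidth, which the tiered design has to manage.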

....MUCH MORE

Also at IEEE Spectrum (Sept. 18):

Nvidia Still on Top in Machine Learning; Intel Chasing
The latest MLPerf inferencing benchmarks include Nvidia’s Grace Hopper superchip  

Large language models like Llama 2 and ChatGPT are where much of the action is in AI. But how well do today’s data center–class computers execute them? Pretty well, according to the latest set of benchmark results for machine learning, with the best able to summarize more than 100 articles in a second.

MLPerf’s twice-a-year data delivery was released on 11 September and included, for the first time, a test of a large language model (LLM), GPT-J. Fifteen computer companies submitted performance results in this first LLM trial, adding to the more than 13,000 other results submitted by a total of 26 companies.

In one of the highlights of the data-center category, Nvidia revealed the first benchmark results for its Grace Hopper—an H100 GPU linked to the company’s new Grace CPU in the same package as if they were a single “superchip.”

Sometimes called “the Olympics of machine learning,” MLPerf consists of seven benchmark tests: image recognition, medical-imaging segmentation, object detection, speech recognition, natural-language processing, a new recommender system, and now an LLM. This set of benchmarks tested how well an already-trained neural network executed on different computer systems, a process called inferencing....

....MUCH MORE