From the Wall Street Journal via MSN, December 1:
Most of the worries about an AI bubble involve investments in businesses that built their large language models and other forms of generative AI on the concept of the transformer, an innovative type of neural network that eight years ago laid the foundations for the current boom.
But behind the scenes, artificial-intelligence researchers are pushing into new approaches that could pack an even bigger payoff.
One early-stage startup developing a transformer alternative, Palo Alto, Calif.-based Pathway, plans to announce Monday that its “Dragon Hatchling” architecture now runs on Nvidia AI infrastructure and Amazon Web Services’ cloud and AI tech stack.
The company has shipped the Dragon Hatchling architecture but doesn’t plan to release commercial models trained on it until next year. Once that happens, its Nvidia and AWS compatibility means companies would be able to put it into production “the next day,” Pathway said.
Dragon Hatchling imbues AI with memory that large language models can’t match, according to Pathway, theoretically enabling a new class of continuously learning, adaptive AI systems. The company also casts its approach as a potentially faster way to get to artificial general intelligence, which some people describe as similar to human-level cognitive ability.
The company isn’t alone in this quest. It regards the large and well-established Anthropic as its biggest obstacle. It also faces other challenges, such as convincing potential users who have just learned one set of AI vocabulary and skills to adopt something new.
Regardless of whether Pathway fulfills its ambitions, it will at least get a chance to make its case to the market. Its arrival also reinforces the intense scientific effort driving AI forward, even as big deals, big valuations and big personalities command the attention.
‘Equations of reasoning’
“This is just fun, right?” said Zuzanna Stamirowska, co-founder and chief executive officer at Pathway, when I met with her and another member of the team at Wall Street Journal headquarters in November. She was enthusing about Pathway’s approach, likening it to scientists’ discovery of thermodynamics, which accelerated the Industrial Revolution by shifting society from simply building engines to understanding the laws of heat and energy that govern them.
Pathway has identified what Stamirowska calls equations of reasoning: fundamental mathematical axioms that explain how intelligence emerges from smaller, local interactions in the brain, she said. That means it can explain how and why intelligence works, rather than just observing that it does, which has been a struggle with transformer-based models.
That also helps Pathway address large language models’ typical limits in building on previous interactions: synapses are strengthened or weakened over time according to their use, said Stamirowska, who holds a Ph.D. in complex systems and has published research on emergent behavior in dynamic networks. She has also received France’s i-Lab innovation prize and been named one of “100 geniuses whose innovation will change the world” by the magazine Le Point.
“Memory is key to intelligence and efficient reasoning,” Stamirowska said....
....MUCH MORE
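The Journal piece doesn't spell out the math behind those strengthening-and-weakening synapses, and Pathway hasn't published the details here either. For readers who want the flavor, here is a minimal, purely illustrative sketch of classic Hebbian plasticity, the textbook version of "synapses that strengthen with use and fade without it." Every name and number below is our own assumption, not Pathway's:

```python
import numpy as np

# Illustrative only: a textbook Hebbian update, NOT Dragon Hatchling's
# actual rule, which isn't detailed in the article.
rng = np.random.default_rng(0)
n_pre, n_post = 8, 4                             # toy network sizes (our choice)
W = rng.normal(scale=0.1, size=(n_post, n_pre))  # synaptic weights

eta = 0.01     # learning rate: co-active synapses strengthen this fast
decay = 0.001  # passive decay: unused synapses weaken a little each step

def step(x, W):
    """One interaction: activity flows through, then the weights adapt."""
    y = np.tanh(W @ x)               # post-synaptic activity
    W = W + eta * np.outer(y, x)     # Hebbian: strengthen co-active pairs
    W = W * (1.0 - decay)            # everything weakens slightly with disuse
    return y, W

# Repeated exposure to a pattern reinforces the synapses it uses, so the
# network carries a memory of past interactions in its weights.
pattern = rng.random(n_pre)
for _ in range(100):
    _, W = step(pattern, W)
```

The point of the contrast: a trained transformer's weights are frozen at inference time, so "memory" has to be bolted on via context windows or external stores, whereas a plasticity rule like the one sketched above keeps rewriting the weights with every interaction.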
Digging into the link-vault on the transformer architecture, which Google invented:
Google Research, August 31, 2017 - Transformer: A Novel Neural Network Architecture for Language Understanding
Nvidia's blog March 25, 2022 - What Is a Transformer Model?
Techspot December 24, 2024 - Meet Transformers: The Google Breakthrough that Rewrote AI's Roadmap ("How Attention Replaced Recurrence and Changed the Rules of AI")
And a couple of our posts:
February 2024 - "IEEE Spectrum explains large language models, the transformer architecture, and how it all works"
November 2024 - While Google Slept: ChatGPT and Transformers