From Puck, August 19:
While others have abandoned the pursuit of artificial general intelligence, Amazon’s AGI Labs is still chasing the dream. Cognitive scientist Danielle Perszyk explains how they’re trying to build advanced, reliable agents—and ruminates on the existential implications if they succeed.
Dr. Danielle Perszyk, a cognitive scientist at Amazon’s AGI Labs, can’t help but feel anxious every time she opens her computer, where she is greeted by all the usual pop-up distractions: email alerts, Slack messages, social media updates, “productivity” app notifications. Of course, her frustration—which has to do with these obstructive detours between thinking and interfacing—was hardly evident in the 45 minutes we spent chatting about her research. But it’s certainly one of the factors driving her interest in A.I. “I think that part of the solution is making the tech go more in the background,” she told me, noting the irony of developing new technologies designed to make older technologies less obtrusive.

Amazon’s AGI Labs hasn’t been around for long: It was launched at the end of 2024, after the tech behemoth hired away the top executives at A.I. startup Adept. The team is largely focused on long-term research bets centered around the creation of so-called agentic (action-taking) systems that can help mitigate the negative cognitive impacts of all the bullshit clicking and scrolling that suppress higher-level thinking and creativity. “The technology that we’re building allows all of this existing structure to be masked, essentially, by creating what will ultimately be a meta interface,” she explained, describing a generative user interface “where you’re starting from the goal that the human has, and the model understands what that goal is, and it will put everything else on silent and only give you the button that you need to click in this moment.” In other words: an interactive focusing tool that would just work.

To achieve this lofty vision, the lab is attempting to build something like an artificial general intelligence, which is among the more controversial end goals of the A.I. race. The idea—some would say fantasy—is to build models that actually think like humans, and perhaps eventually replace all human labor. Earlier this year, there was excitement across Silicon Valley that we might be at the precipice of A.G.I.; more recently, there’s been growing acknowledgement that large language models alone are likely not up to the task. Sam Altman, for instance, has backed away from his earlier suggestions that GPT-5 would represent a meaningful step toward A.G.I., and now dismisses A.G.I. as an unrealistic near-term target.

Perszyk’s lab, however, is defining A.G.I. a bit differently—as a system that can do everything a human can do, but on a computer. Their goal is not to create models that can replace human labor. Instead, it’s to build systems that understand how we think, and can therefore enhance our capabilities through “useful agents.” In March, the team unveiled Nova Act, a research preview for a model designed to perform actions within a web browser—a stepping stone toward the “meta interface” she described to me. The system, according to Amazon, achieved 90 percent reliability rates across specific enterprise use cases. Months earlier, Anthropic and OpenAI launched iterations of this same concept in the form of Computer Use and Operator, respectively, but so far, the trouble with those “agents” has been reliability: Like all L.L.M.s, they’re error-prone, and when you’re dealing with actions, rather than words, the impact of those mistakes can be a lot worse. But Perszyk believes her lab has struck upon an approach that is much more commercially viable.

Deconstruction
Obviously, though, there’s a long road ahead. As Perszyk explained, the “ingredients that go into the breakthroughs for L.L.M.s are categorically different from agents. It’s not even starting with the foundation of the L.L.M.s that we all take for granted. It’s a whole different beast.” In short, since agentic products require the model to understand actions as well as words, those systems must be trained on “a much more diverse range of skills and human interactions,” Perszyk said, describing her lab’s approach as the only path to generalization—the G in A.G.I....
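To make the "action-taking" distinction concrete, here is a minimal sketch of the perceive-decide-act loop that browser agents of this kind run. This is not Amazon's Nova Act API or any vendor's actual system: the propose_action stub is a hypothetical placeholder for whatever model does the deciding, and the browser plumbing uses the open-source Playwright library for illustration only.

```python
# Sketch of a goal-driven browser agent loop: perceive the page,
# ask a model for the single next action, execute it, repeat.
from playwright.sync_api import sync_playwright


def propose_action(goal: str, page_text: str) -> dict:
    """HYPOTHETICAL stand-in for an agent model call. A real system
    would send the goal plus the page state to a model and get back
    one action, e.g. {"type": "click", "selector": "#submit"}."""
    return {"type": "done"}  # placeholder so the sketch runs end to end


def run_agent(goal: str, start_url: str, max_steps: int = 20) -> None:
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        page.goto(start_url)
        for _ in range(max_steps):
            # Perceive: read what is currently on the page.
            state = page.inner_text("body")
            # Decide: ask the model for the one next step toward the
            # goal -- the "only the button you need" idea, automated.
            action = propose_action(goal, state)
            # Act: execute it. Unlike a wrong word, a wrong click has
            # side effects, which is why reliability is the hard part.
            if action["type"] == "click":
                page.click(action["selector"])
            elif action["type"] == "fill":
                page.fill(action["selector"], action["text"])
            elif action["type"] == "done":
                break
        browser.close()


if __name__ == "__main__":
    run_agent("find the order-status page", "https://example.com")
```

Note how every iteration compounds: a model that is right 99 percent of the time per step still fails often over a 20-step task, which is why the per-step reliability numbers agents are judged on matter so much more than they would for plain text generation.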
....MUCH MORE