Wednesday, November 6, 2024

"How Does AI ‘Think’? We Are Only Starting to Understand That"

Key word: Starting.*

From the Wall Street Journal, Oct. 22, 2023:

Despite all the talk about the AI revolution, the only thing we know for sure is that we can’t really know what’s coming

You can’t go very far in Silicon Valley without hitting an “AI for X” startup. AI for enterprise tech. AI for medicine. AI for dating. And on and on.

Some of these startups, no doubt, are pure marketing hype. But even most of the others are simply applying existing AI to a given category of human need or desire—licensing big AI systems from well-capitalized startups and tech giants, such as OpenAI’s ChatGPT, Google’s Bard and Anthropic’s Claude, and applying them to whatever area of human endeavor their founders think hasn’t had enough AI thrown at it yet.

The sudden ubiquity of these startups and services suggests that the AIs they are leveraging are ready for prime time. In many ways, though, they are not. Not yet, anyway. But the good news (for AI enthusiasts, anyway) is that the underlying AIs upon which all this hype rests are getting better, fast. And that means that today’s hype could quickly become tomorrow’s reality.

To understand all of this—why the AIs aren’t ready for prime time, how they are getting better, and what that can tell us about where we’re heading—we have to go on a bit of an intellectual journey.  

Today’s steam engines

To start, it helps to understand how these AIs work. Two terms it’s imperative to know: “generative AI” and “foundation models.” The current generation of AIs that have people so excited—the ones doing things that until a couple of years ago it seemed only humans could—are what are known as generative AIs. They are based on foundation models, which are gigantic systems trained on enormous corpuses of data—in many cases, terabytes of information representing everything readily available on the internet.

The best way to understand where these AIs might take us, and why predictions are only so useful, is to compare them to other transformative technologies in their earliest stages of development. Take the steam engine. No one in the early 18th century could have known that the primitive steam-powered pumps invented by Thomas Savery and Thomas Newcomen, used for removing water from mines, would someday evolve into highly efficient steam turbines essential for generating electricity. (For one thing, electricity had yet to be discovered.)

The first steam engines were the product of intuition and tinkering, not a robust understanding of the science of thermodynamics, says George Musser, author of a forthcoming book about how scientists are pioneering new ways to probe the nature of human and machine intelligence.

In a pattern repeated over and over in the history of technology, first there was the thing—in this case the steam engine—and only later did we come to understand it. That understanding, which we call thermodynamics, took on a life of its own, becoming one of the most universally applicable branches of physics.

Well, it’s happening again. In an almost perfect recapitulation of that history, today’s AIs have been the products of intuition and tinkering, and we don’t understand how they work, says Musser. But like the earliest steam engines, today’s generative AIs contain within them the seeds of countless future applications. Unlocking those applications will require something that is only just getting started—an understanding of how foundation models and generative AI actually work.

To that end, computer scientists, mathematicians, physicists, neuroscientists and engineers are all coming together to create a new area of study: a universal science of machine intelligence. And as they develop it, we’re gaining useful insights into what AIs might one day be capable of.

In search of reason

Some researchers, for instance, are convinced that one kind of foundation model is already capable of something that is, for all intents and purposes, reasoning.  

Here, we have to introduce a third term—large language model. A large language model is a type of generative AI: a foundation model trained exclusively on text. (ChatGPT, Bard and Meta’s new chatbots are all examples.)

Whether large language models have crossed the threshold from merely memorizing and regurgitating information about the world, to synthesizing it in completely novel ways—that is, reasoning about it—is a matter of debate.

Blaise Aguera y Arcas, an AI researcher at Google Research, cites the ability of today’s large language models to handle tricky tasks, when prompted with enough information about them, as evidence of their ability to reason. For example, with proper coaxing, it’s possible to get a large language model to give correct answers to basic mathematical questions, even though, say, the product of two particular four-digit numbers isn’t anywhere in its training data.

“Figuring that out means having had to have learned what the algorithm for multiplication actually is—there is no other way to get that right,” says Aguera y Arcas.
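
A back-of-the-envelope aside from us, not from the Journal: part of why memorization is a tough explanation for that particular trick is sheer scale. There are 9,000 four-digit numbers, hence 81 million ordered pairs, so any specific product is unlikely to appear verbatim even in a web-scale corpus. A minimal Python sketch of the counting (illustrative numbers only):

```python
# Back-of-the-envelope sketch (our illustration, not from the article):
# the space of four-digit multiplication problems is large enough that
# memorizing it verbatim is an implausible shortcut.
import random

four_digit = range(1000, 10000)       # 9,000 four-digit integers
pairs = len(four_digit) ** 2          # 81,000,000 ordered pairs

print(f"Distinct ordered four-digit pairs: {pairs:,}")

# A few randomly drawn problems; any one of them is unlikely to appear
# verbatim anywhere in a web-scale training corpus.
random.seed(0)
for _ in range(3):
    a, b = random.choice(four_digit), random.choice(four_digit)
    print(f"{a} * {b} = {a * b}")
```

Getting those right reliably, the argument goes, takes something closer to an algorithm than a lookup table.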

Other researchers think Aguera y Arcas is overstating the amount of reasoning today’s large language models are capable of. Sara Hooker, director of Cohere for AI, the nonprofit research wing of AI company Cohere, says that some of what people think is reasoning by large language models could just be things they’ve memorized. Memorization alone, not some ability to reason acquired by learning language, could explain why these models gain new capabilities as they grow bigger.

“A lot of the mystery is that we just don’t know what’s in our pretraining data,” says Hooker. That lack of knowledge comes from two factors. First, many AI companies are no longer revealing what’s in their pretraining data. Second, these pretraining data sets are so big (think: all the text available on the open web) that when we ask the AIs trained on them any given question, it’s difficult to know whether the answer just happens to be in that ocean of data already....
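
An aside from us, not from the article: Hooker’s point implies an obvious first check before crediting a model with reasoning, namely searching the pretraining corpus for the answer verbatim. That is precisely the check outsiders can rarely run, because the corpora are enormous and increasingly undisclosed. Below is a minimal sketch of such a contamination check, using a made-up two-document “corpus” as a stand-in:

```python
# Minimal sketch of a verbatim-contamination check (our illustration,
# not Cohere's or the Journal's method): does the "reasoned" answer
# already appear word-for-word in the pretraining text?
from typing import Iterable

def appears_verbatim(answer: str, corpus_docs: Iterable[str]) -> bool:
    """True if the answer occurs verbatim (case/whitespace-insensitive)."""
    needle = " ".join(answer.split()).lower()
    return any(needle in " ".join(doc.split()).lower() for doc in corpus_docs)

# Hypothetical stand-in corpus; the real obstacle is that the actual
# pretraining sets are both enormous and, increasingly, undisclosed.
docs = [
    "The capital of France is Paris.",
    "1234 x 5678 = 7006652",
]
print(appears_verbatim("1234 x 5678 = 7006652", docs))        # True
print(appears_verbatim("The capital of Peru is Lima", docs))  # False
```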

....If you’ve gotten this far, here’s the payoff: If today’s large language models are capable of some amount of reasoning, however elementary, it could yield years of rapid advances in the abilities of generative AIs.

In part, that’s because language isn’t just another medium of communication, like pictures or sound. It’s a technology humans developed for describing absolutely everything in the world we can conceive of, and how it all relates. Language gives us the ability to build models of the world, even absent any other stimuli, like vision or hearing, says Aguera y Arcas. That is why a large language model can write fluently about the relationship between, say, two colors, even though it has never “seen” either of them, he says....

*On November 7, 2017 we were posting "We Might Be Getting Closer To Understanding How True 'Black Box' AI Makes Decisions":
Years ago I heard a guy at a poker table refuse to play a game with a half-dozen wild cards and a couple other variations on the standard game. His comment:
"It's not so much the losing but the not knowing why I lost that gets me."
Well, it is the losing but he had a point.
From MIT's Technology Review...
Which referred back six weeks to another post: "Cracking Open the Black Box of Deep Learning" with this introduction:
One of the spookiest features of black box artificial intelligence is that, when it is working correctly, the AI is making connections and casting probabilities that are difficult-to-impossible for human beings to intuit.
Try explaining that to your outside investors.

You start to sound, to their ears anyway, like a loony who is saying "Etaoin shrdlu, give me your money, gizzlefab, blythfornik, trust me."

See also the famous Gary Larson cartoons on how various animals hear and comprehend:...

Three days later Bloomberg published:

The Massive Hedge Fund Betting on AI

The second paragraph of the story:

...Man Group, which has about $96 billion under management, typically takes its most promising ideas from testing to trading real money within weeks. In the fast-moving world of modern finance, an edge today can be gone tomorrow. The catch here was that, even as the new software produced encouraging returns in simulations, the engineers couldn’t explain why the AI was executing the trades it was making. The creation was such a black box that even its creators didn’t fully understand how it worked. That gave Ellis pause. He’s not an engineer and wasn’t intimately involved in the technology’s creation, but he instinctively knew that one explanation—“I can’t tell you why …”—would never fly with big clients looking for answers when Man inevitably lost some of their money... 

Let Me Be Clear: I Have No Inside Information On Who Will Win The Man-Booker Prize Next Month (hedge funds, AI and simultaneous discovery), September 26, 2017