Saturday, May 24, 2025

"Does AI Progress Have a Speed Limit?"

Well, with Einstein back on top again, the speed of light would seem to constrain the upper end.

See - CERN: Light Once Again Faster than Neutrinos, Problem May Have Been a Loose Cable
Don't you hate it when that happens? A bad connection and all of a sudden you're calling Einstein a moron? 

From Asterisk magazine, issue 10:

A conversation about the factors that might slow down the pace of AI development, what could happen next, and whether we’ll be able to see it coming.

Ajeya Cotra: I'm interested in something you mentioned in your recent paper, AI as Normal Technology, and on Twitter — the idea that the external world puts a speed limit on AI development. With applications like self-driving cars or travel booking agents, you need to test them in real-world conditions so they encounter failures they can learn from. This creates a speed limit in two ways: first, you have to wait to collect that data, and second, people may be reluctant to be guinea pigs while the system works out its bugs. That's why we're currently creating narrower, more specialized products with extensive code to handle edge cases, rather than letting AI learn through trial and error in the real world. Is that a good summary of the concept?

Arvind Narayanan: That's mostly correct, with one clarification: I don't think we'll always need to manually code for edge cases. The handling of edge cases can be learned, but that learning will often need to happen within real organizations doing real tasks — and that's what imposes the speed limit.

Ajeya: So the current assumption is that, to be production-ready for applications like travel booking, you need significant training on that specific application. But what if continued AI capability development leads to better transfer learning from domains where data is easily collected to domains where it’s harder to collect — for instance, if meta-learning becomes highly effective with most training done on games, code, and other simulated domains? How plausible do you think that is, and what should we watch for to see if that’s where things are headed?

Arvind: Those are exactly the kinds of things we should watch for. If we can achieve significant generalization from games to more open-ended tasks, or if we can build truly convincing simulations, that would be very much a point against the speed limit thesis. And this relates to one of the things I was going to bring up, which is that we’re both fans of predictions, but perhaps in different ways. AI forecasting focuses on predictions with very clear, adjudicable yes/no answers. There are good reasons for this, but one downside is that we’re focusing on really, really narrow questions. Instead, we might want to predict at the level of worldviews — propose different perspectives, identify a collection of indicators and predictions, gather data, and use human judgment to assess how well reality matches the predictions that result from different worldviews. The big-picture claim that the external world puts (and will continue to put) a speed limit on AI development is one that isn’t precise enough that you could turn it into a Metaculus question, but that doesn’t mean it isn’t testable. It just needs more work to test, and some human judgment.

Ajeya: In that case, we should talk about observations that might distinguish your worldview from mine. I think transfer learning, which I just mentioned, is one of those things: how well does training on easy-to-collect data — games, simulated environments, internal deployment — transfer to messier, real-world tasks like booking flights? Or, more relevant to my threat models, how well does it transfer to things like surviving in the wild, making money, and avoiding being shut down? And how would we know? We don't have great meta-learning or transfer benchmarks because they're inherently difficult to construct — to test transfer from the training distribution, we need to know what the AI is trained on, and we don't have that information. Do you have thoughts on what we might observe in 2025 that would suggest this transfer or meta-learning is working much better than expected?

Arvind: You pointed out the dimension of easy- versus hard-to-collect data. I'd add a related but distinct dimension that's very important to me: low versus high cost of errors. This explains a lot of the gaps we're seeing between where agents are and aren't working effectively. From my experience, the various deep research tools that have been released are more useful than a web agent for shopping, say, because OpenAI's Deep Research, though agentic, is solving a generative task rather than taking costly real-world actions.

A major theme in our work is how hard it is to capture this with benchmarks alone. People tend to look at two extremes of evaluation: pure benchmarks or pure vibes. There's a huge space in the middle we should develop. Uplift studies are one example — giving some people access to a tool and others not — but there's enormous room for innovation there. A lot of what my group is doing with the Science of Agent Evaluation project is to figure out how to measure reliability as a separate dimension from capability.

Ajeya: I'm kind of interested in getting a sneak peek at the future by creating an agent that can do some task, but too slowly and expensively to be commercially viable. I'm curious if your view would change if a small engineering team could create an agent with the reliability needed for something like shopping or planning a wedding, but it's not commercially viable because it's expensive and takes too long on individual actions, needing to triple-check everything....

If the CERN neutrino had in fact been faster than the speed of light, Ajeya's wish for a sneak peek of the future would be possible, because the neutrino would have had to come from the future, making time travel easy-peasy.

On the mortality of internet hyperlinks: a quick check of the CERN link in the 2012 post we used as an introduction at the top of this page shows that it has rotted and was not saved in time by the Internet Archive.

However, CERN's site still hosts the original 2011 report under a series of updates, the most recent being:

UPDATE 8 June 2012
Neutrinos sent from CERN to Gran Sasso respect the cosmic speed limit