A conversation about the factors that might slow down the pace of
AI development, what could happen next, and whether we’ll be able to
see it coming.
Ajeya Cotra: I'm interested in something you mentioned in your recent paper, AI as Normal Technology,
and on Twitter — the idea that the external world puts a speed limit on
AI development. With applications like self-driving cars or travel
booking agents, you need to test them in real-world conditions so they
encounter failures they can learn from. This creates a speed limit in
two ways: first, you have to wait to collect that data, and second,
people may be reluctant to be guinea pigs while the system works out its
bugs. That's why we're currently creating narrower, more specialized
products with extensive code to handle edge cases, rather than letting
AI learn through trial and error in the real world. Is that a good
summary of the concept?
Arvind Narayanan: That's
mostly correct, with one clarification: I don't think we'll always need
to manually code for edge cases. The handling of edge cases can be
learned, but that learning will often need to happen within real
organizations doing real tasks — and that's what imposes the speed
limit.
Ajeya: So
the current assumption is that, to be production-ready for applications
like travel booking, you need significant training on that specific
application. But what if continued AI capability development leads to
better transfer learning from domains where data is easily collected to
domains where it’s harder to collect — for instance, if meta-learning
becomes highly effective with most training done on games, code, and
other simulated domains? How plausible do you think that is, and what
should we watch for to see if that’s where things are headed?
Arvind: Those
are exactly the kinds of things we should watch for. If we can achieve
significant generalization from games to more open-ended tasks, or if we
can build truly convincing simulations, that would be very much a point
against the speed limit thesis. And this relates to one of the things I
was going to bring up, which is that we’re both fans of predictions,
but perhaps in different ways. AI forecasting focuses on predictions
with very clear, adjudicable yes/no answers. There are good reasons for
this, but one downside is that we’re focusing on really, really narrow
questions. Instead, we might want to predict at the level of worldviews —
propose different perspectives, identify a collection of indicators and
predictions, gather data, and use human judgment to assess how well
reality matches the predictions that result from different worldviews.
The big-picture claim that the external world puts (and will continue to
put) a speed limit on AI development is one that isn’t precise enough
that you could turn it into a Metaculus question, but that doesn’t mean
it isn’t testable. It just needs more work to test, and some human
judgment.
Ajeya: In that case, we should talk
about observations that might distinguish your worldview and mine. I
think transfer learning, which I just mentioned, is one of those things:
how well does training on easy-to-collect data — games, simulated
environments, internal deployment — transfer to messier, real-world
tasks like booking flights? Or, more relevant to my threat models, how
well does it transfer to things like surviving in the wild, making
money, and avoiding being shut down? And how would we know? We don't
have great meta-learning or transfer benchmarks because they're
inherently difficult to construct — to test transfer from the training
distribution, we need to know what the AI is trained on, and we don’t
have that information. Do you have thoughts on what we might observe in
2025 that would suggest this transfer or meta-learning is working much
better than expected?
Arvind: You pointed out the
dimension of easy- versus hard-to-collect data. I'd add a related but
distinct dimension that's very important to me: low versus high cost of
errors. This explains a lot of the gaps we're seeing between where
agents are and aren't working effectively. In my experience, the
various deep research tools that have been released are more useful than
a web agent for shopping, say, because OpenAI’s Deep Research, though
agentic, is solving a generative task rather than taking costly
real-world actions.
A major theme in our work is how hard it is to
capture this with benchmarks alone. People tend to look at two extremes
of evaluation: pure benchmarks or pure vibes. There's a huge space in
the middle we should develop. Uplift studies are one example — giving
some people access to a tool and others not — but there's enormous room
for innovation there. A lot of what my group is doing with the Science of Agent Evaluation project is figuring out how to measure reliability as a separate dimension from capability.
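To make that capability-versus-reliability distinction concrete, here is a minimal sketch, not the Science of Agent Evaluation project's actual methodology: run an agent several times on each task and report both whether it ever succeeds and whether it always succeeds. The run_agent stub and the task names are hypothetical placeholders.

```python
import random

# Illustrative sketch only: capability = "ever succeeds", reliability = "always
# succeeds", estimated by running each task several times. run_agent is a
# hypothetical stand-in for invoking a real agent and checking the outcome.

def run_agent(task: str) -> bool:
    """Pretend to run the agent once on a task; succeeds on ~80% of attempts."""
    return random.random() < 0.8

def evaluate(tasks: list[str], attempts: int = 5) -> dict[str, float]:
    results = {task: [run_agent(task) for _ in range(attempts)] for task in tasks}
    capability = sum(any(runs) for runs in results.values()) / len(tasks)   # succeeded at least once
    reliability = sum(all(runs) for runs in results.values()) / len(tasks)  # succeeded every time
    return {"capability": capability, "reliability": reliability}

print(evaluate(["book a flight", "plan a wedding", "summarize a literature review"]))
```

The gap between those two numbers is one rough way to quantify how far an agent is from being trusted with actions where errors are costly.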
Ajeya:
I'm kind of interested in getting a sneak peek at the future by
creating an agent that can do some task, just too slowly and expensively
to be commercially viable. I'm curious whether your view would change if a
small engineering team could build an agent with the reliability needed
for something like shopping or planning a wedding, but one that isn't
commercially viable because it's expensive and takes too long on
individual actions, needing to triple-check everything....
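One way to see how an agent could be reliable yet uneconomical is a back-of-the-envelope model with made-up numbers (an illustration, not anything stated in the conversation): if each step of a long task succeeds with some probability per attempt, letting the agent retry or re-verify every step a few times sharply raises end-to-end reliability while multiplying cost.

```python
# Back-of-the-envelope illustration with made-up numbers: each step of a task
# succeeds with probability p_step per attempt, and the agent gets `checks`
# independent tries at every step before the task fails.

def task_success_prob(p_step: float, steps: int, checks: int) -> float:
    step_ok = 1 - (1 - p_step) ** checks  # step survives if any try succeeds
    return step_ok ** steps               # every step must survive

for checks in (1, 2, 3):
    p = task_success_prob(p_step=0.95, steps=50, checks=checks)
    print(f"{checks} tries per step: ~{p:.0%} end-to-end success at ~{checks}x cost")
```

Under these assumptions, triple-checking every step takes a 50-step task from roughly 8% end-to-end success to roughly 99%, at roughly three times the cost, which is the kind of reliable-but-expensive agent described above.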