Sunday, May 4, 2025

A.I.: "Claude Is 24% of the Way to Stealing Your Job"

From AGI Friday, May 2:

Not actually, but it did hit 24% on a new stealing-your-job benchmark

Today, via friend of the newsletter Ray Sarraga, who asked in the comments of last week’s AGI Friday, I’d like to react to a spate of news articles mocking some AI research out of Carnegie Mellon University. Here are the headlines:

  • “Silicon Valley’s Biggest Comedy Show Yet: AI Tries (And Fails) To Run A Company”

  • “A Fake Company Staffed Only With AI Agents Was a Total Disaster”

And so on. They call it the nail in the coffin for claims that AI is on track to steal your job.

(This all reminded me that 11 years ago I wrote a post called “Welcome, Job-Destroying Robots” which has aged… interestingly. It explicitly set aside the question of AGI and just ranted about people being confused about how economics works. I guess I still stand by it. Just that I no longer think AGI is so far off. Another 11 years or so might be about right.)

So I don’t want to just mock these “lol look how dumb AI is” articles. They have value in counterbalancing the hype. But this actually gets at the core of what I'm hoping to convey with AGI Friday. The hypesters and the pooh-poohers are both deeply wrong. Depending on how the future plays out, one or the other group will be able to pretend they knew it all along. But it's kind of a coin toss, depending on the timeframe. If we take 2030 as the cutoff then I personally think the pooh-poohers have the edge. But when the articles say “the machines aren't coming for your job anytime soon” that sure sounds like it means more than a 5-year horizon. Your job is very safe this year and probably safe this decade. Beyond that, literally (almost literally “literally”) anything is possible.

Having said all that, these articles do seem a bit dumb and disingenuous. It’s perfectly predictable how a fake company experiment, the way these articles describe it, would go. It’s like putting a bunch of toasters and waffle irons in an empty building and gloating that they failed to start a viable bistro. The next prediction from those of us worried about the trajectory towards AGI is that by the end of 2025 we'll have the first so-called agents that are actually useful, that can go out and do specific tasks for you on the internet. If even that fails to happen, that'll be the first clue to lengthen our AGI timelines.

And a final note: the news articles are dumb, but the research they think they’re making fun of is great. The authors have built a new benchmark for measuring progress towards bona fide job-stealing. Not that Claude’s high score of 24% means very much yet. Claude can do 24% of the somewhat contrived tasks in the benchmark but the authors don't claim that hitting 100% is sufficient (or even necessary) for AGI....

....MORE