Friday, July 28, 2023

"AI and the paperclip problem"

From VoxEU - CEPR, June 10, 2018:

Philosophers have speculated that an AI given a task such as creating paperclips might cause an apocalypse by learning to divert ever-increasing resources to the task, and then learning how to resist our attempts to turn it off. But this column argues that, to do this, the paperclip-making AI would need to create another AI that could acquire power both over humans and over itself, and so it would self-regulate to prevent this outcome. Humans who create AIs with the goal of acquiring power may be a greater existential threat.

The notion that artificial intelligence (AI) may lead the world into a paperclip apocalypse has received a surprising amount of attention. It motivated Stephen Hawking and Elon Musk to express concern about the existential threat of AI. It has even led to a popular iPhone game explaining the concept. 

The concern isn’t about paperclips per se. Instead, it is that, at some point, switching on an AI may lead to the destruction of everything, and that this destruction would both be easy and arise from a trivial or innocuous initial intent.

The underlying ideas behind the notion that we could lose control over an AI are profoundly economic. But, to date, economists have not paid much attention to them. Instead, their focus has been on the more mundane, recent improvements in machine learning (Agrawal 2018). Taking a more future-bound perspective, my research (Gans 2017) shows that for a paperclip apocalypse to occur, we must make important underlying assumptions. This gives me reason to believe that the world is less likely to end this way than non-economists believe.

What is the paperclip apocalypse?
The notion arises from a thought experiment by Nick Bostrom (2014), a philosopher at the University of Oxford. Bostrom was examining the 'control problem': how can humans control a super-intelligent AI even when the AI is orders of magnitude smarter than they are? Bostrom's thought experiment goes like this: suppose that someone programs and switches on an AI that has the goal of producing paperclips. The AI is given the ability to learn, so that it can invent ways to achieve its goal better. As the AI is super-intelligent, if there is a way of turning something into paperclips, it will find it. It will want to secure resources for that purpose. The AI is single-minded and more ingenious than any person, so it will appropriate resources from all other activities. Soon, the world will be inundated with paperclips.

It gets worse. We might want to stop this AI. But it is single-minded and would realise that being stopped would subvert its goal. Consequently, the AI would become focussed on its own survival. It is already fighting humans for resources, but now it will want to fight humans because they are a threat (think The Terminator).

This AI is much smarter than us, so it is likely to win that battle. We have a situation in which an engineer has switched on an AI for a simple task but, because the AI expanded its capabilities through its capacity for self-improvement, it has innovated to better produce paperclips, and developed power to appropriate the resources it needs, and ultimately to preserve its own existence.

Bostrom argues that it would be difficult to control a super-intelligent AI – in essence, better intelligence beats weaker intelligence....


Possibly related, July 7, 2023:

Puny Human, I Scoff At Your AI, Soon You Will Know The Power Of Artificial 'Super' Intelligence. Tremble and Weep
[insert 'Bwa Ha Ha' here, if desired]