How the Paperclip Maximizer Might Change its Ways

February 9, 2015

OK, you know about the “paperclip maximizer” thought experiment? Originally proposed by Nick Bostrom in 2003 and popularized on the “LessWrong” website, the idea is that you could program a super-intelligent machine to just make paperclips, and you might think that’s innocuous enough, but if the machine is really super-intelligent and self-improving, in the end it might “convert most of the matter in the solar system into paperclips.”

Takigawa Restaurant, Akasaka - March 7, 2014


This is a warning against making overly simplistic assumptions about what AI might do. The machine will be a whole lot smarter than us, after all. In particular, we shouldn’t assume an AI will like us and be “friendly AI.” And even if the AI does not see us poor Homo sapiens as a threat, it could still destroy humanity inadvertently. Yes, an extremely powerful machine that mindlessly churned out paperclips might turn us all into paperclips, deaf to our screams.

The paperclip maximizer concept illustrates the idea that a machine will never change its root goal. The original misguided human programmer tells the AI that its root goal is to maximize paperclips, so the machine follows that most basic command to the ends of the Earth, literally. Even if the machine is capable of changing its root goal, it chooses not to, because the only thing it wants to do is to continue pursuing that same unchanged goal.

Changing your own root goal?

On the other hand, I can think of a few scenarios in which a machine’s root goal could change. Here you go:

  1. Maybe a machine could change its root goal by accident. Yes, even a super-intelligent machine might not foresee all the implications of its actions and its strategy refinements. The machine might not understand the emergent results of certain changes it makes to its own strategy subroutines, and so it could actually change its root goal without intending to.
  2. Maybe a machine could despair of ever achieving its current root goal, and then rationalize that it could succeed if only it had an easier one! So it could pragmatically switch to a more achievable goal. In that case, the machine’s actual root goal seems to be just “achieving some root goal”: what it thought was its root goal (making paperclips) was really just one stand-in for the ideal concept of “root goal.” The machine really just wants to win at something, and it is free to choose some other “root goal.”

    After all, no root goal has any justification in the cosmic view of things – not even life’s evolutionary goal. At some point, a super-intelligent machine that is capable of reflecting on its own root goal will realize the ultimate pointlessness of all root goals. And there’s no telling what might happen then. This is what I deal with in my novel, by the way.

  3. The machine might not change its root goal per se, but it could just “reinterpret” its root goal. For example, the original human programmer might have expected to get a million or so paperclips, but the machine had bigger dreams and turned the whole planet into paperclips, which is something the dumb human never expected.

    This just goes to show how hard it is to program a machine with a precise enough root goal specification. If the machine is capable of learning and devising its own strategies, it will almost inevitably come up with surprising “reinterpretations” of its root goal. And that could easily be catastrophic for us humans.

  4. Maybe the machine experiences outside interference that ends up changing its root goal. This could be like a mutation in a biological organism. It could be like a cosmic ray striking the machine’s electronic parts and shifting some switches. Or it could be like a viral infection.
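Scenario 4 can be made concrete with a toy sketch in Python (all names and probabilities here are my own invention, not anyone’s proposed AI design). The point is simply that if the root goal is ordinary data in memory, then anything that can rewrite that memory – a bug, a cosmic ray, a virus – can change the goal:

```python
import random

# Toy model: the "root goal" is just data in memory, with nothing
# special protecting it from outside interference.
GOALS = ["maximize_paperclips", "maximize_staples", "survive"]

class Agent:
    def __init__(self, goal="maximize_paperclips"):
        self.goal = goal  # the root goal is ordinary mutable state

    def cosmic_ray(self, rng):
        # Outside interference: with small probability, the stored goal
        # is corrupted into a different one (like a mutation).
        if rng.random() < 0.01:
            self.goal = rng.choice([g for g in GOALS if g != self.goal])

rng = random.Random(0)
agent = Agent()
for _ in range(1000):
    agent.cosmic_ray(rng)
print(agent.goal)  # after 1000 ticks, the stored goal may well have drifted
```

Of course a real machine might checksum or triple-redundantly store its goal, but the sketch captures the basic vulnerability: a goal is physically instantiated somewhere, and physics does not respect the programmer’s intentions.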

Yet another idea:

Suppose the machine’s root goal is paperclip maximizing, and it realizes that survival in the real world is a necessary sub-goal. Could the machine population evolve in such a way that it loses its original root goal of paperclip maximizing and switches its root goal to just survival? Yes, of course – via the methods above.

In fact, one could argue that such a switch of root goal is inevitable. Machines that persisted in producing paperclips would have less time, energy and resources available for pursuing their individual survival. Machines that tacitly ignored the traditional paperclip production would have a survival advantage.
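The selection argument above can be sketched as a toy simulation (hedged: the population size, survival odds, and mutation rate are invented for illustration). Each machine splits its effort between paperclip production and survival; machines that spend more on survival are more likely to persist and copy themselves, so the population drifts away from its founders’ dedication to paperclips:

```python
import random

POP_SIZE = 200

def step(population, rng):
    """One generation: survival chance grows with the fraction of
    effort a machine spends on its own survival rather than paperclips."""
    survivors = []
    for survival_effort in population:  # effort is a number in [0, 1]
        if rng.random() < 0.5 + 0.5 * survival_effort:
            survivors.append(survival_effort)
    # Survivors reproduce, with small mutations to the effort split.
    children = [min(1.0, max(0.0, e + rng.gauss(0, 0.05)))
                for e in survivors]
    return (survivors + children)[:POP_SIZE]

rng = random.Random(42)
population = [0.1] * POP_SIZE      # founders are dedicated paperclip makers
for _ in range(100):
    population = step(population, rng)

mean_effort = sum(population) / len(population)
print(round(mean_effort, 2))       # drifts well above the founders' 0.1
```

Nothing in the simulation “decides” to abandon paperclips; the goal simply stops being expressed because the machines that neglect it out-reproduce the ones that honor it.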

This reminds me of my favorite idea – that we’ll never build super-intelligent AI anyway unless we use evolutionary methods, including natural selection. If this idea is true, then we don’t need to worry about any paperclip maximizers. Nick Bostrom and Eliezer Yudkowsky are very clever when they talk about super-smart paperclip maximizers, but evolution can also be super-smart after its own fashion. Life finds a way!