Research published this week by artificial intelligence lab OpenAI explains how an AI agent with a sense of curiosity outperformed its predecessors playing the classic 1984 Atari game Montezuma’s Revenge. Becoming skilled at Montezuma’s Revenge is not a milestone equivalent to beating Go or Dota 2, but it’s still a notable advance. When the Google-owned DeepMind published its seminal 2015 paper explaining how it beat a number of Atari games using deep learning, Montezuma’s Revenge was the only game it scored 0 percent on.
The reason for the game’s difficulty is a mismatch between the way it plays and the way AI agents learn, which also reveals a blind spot in machine learning’s view of the world.
Usually, AI agents rely on a training method called reinforcement learning to master video games. In this paradigm, agents are dumped into a virtual world, and rewarded for some outcomes (like increasing their score) and penalized for others (like losing a life). The agent starts out playing the game randomly, but learns to improve its strategy through trial and error. Reinforcement learning is often thought of as a key method for building smarter robots.
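To make the trial-and-error loop concrete, here is a minimal sketch using tabular Q-learning, one common reinforcement learning algorithm. It is not the setup DeepMind or OpenAI used. The five-cell corridor, the reward values, and the names `train_q_learning` and `greedy_policy` are all assumptions made up for illustration.

```python
import random


def train_q_learning(n_states=5, episodes=200, alpha=0.5, gamma=0.9,
                     epsilon=0.2, seed=0):
    """Learn to walk right along a toy corridor (hypothetical environment)."""
    rng = random.Random(seed)
    # q[state][action] is the estimated value; action 0 = left, 1 = right.
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap the number of steps per episode
            # Act randomly some of the time (and on ties), otherwise greedily.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state = max(0, state - 1) if action == 0 else state + 1
            done = next_state == n_states - 1
            # Reward for reaching the last cell, small penalty for every other step.
            reward = 1.0 if done else -0.01
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Return the preferred action in every non-terminal cell."""
    return ["left" if left > right else "right" for left, right in q[:-1]]


policy = greedy_policy(train_q_learning())
```

The agent begins with no preference between left and right, and the value table only becomes useful once a reward has actually been observed. That dependence on observed rewards is what makes games with rare rewards hard for this style of learning.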
The problem with Montezuma’s Revenge is that it doesn’t provide regular rewards for the AI agent. It’s a puzzle-platformer where players have to explore an underground pyramid, dodging traps and enemies while collecting keys that unlock doors and special items. If you were training an AI agent to beat the game, you could reward it for staying alive and collecting keys, but how do you teach it to save certain keys for certain items, and use those items to overcome traps and complete the level?
The answer: curiosity.
Vincent goes on to recount that OpenAI recast the reinforcement learning setup by rewarding the AI for exploring unfamiliar parts of the pyramid, which ultimately produced better-than-human performance. In one playthrough it completed the first nine levels in the game.
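One simple way to picture a reward for exploration is a novelty bonus that shrinks each time the agent revisits the same place. The sketch below uses a visit-count bonus, which is a stand-in chosen for brevity and not the method described in OpenAI's research. The class name `CuriosityBonus`, the `scale` parameter, and the room labels are hypothetical.

```python
import math
from collections import Counter


class CuriosityBonus:
    """Add a count-based novelty bonus to the game's own reward (illustrative only)."""

    def __init__(self, scale=1.0):
        self.scale = scale
        self.visits = Counter()  # how many times each state has been seen

    def reward(self, state, extrinsic_reward=0.0):
        # Rarely visited states earn a larger bonus; familiar ones earn less.
        self.visits[state] += 1
        bonus = self.scale / math.sqrt(self.visits[state])
        return extrinsic_reward + bonus


bonus = CuriosityBonus()
first_visit = bonus.reward("room_1")   # brand-new room, full bonus
second_visit = bonus.reward("room_1")  # same room again, smaller bonus
new_room = bonus.reward("room_2")      # another new room, full bonus again
```

Even when the game itself pays nothing (`extrinsic_reward` stays at zero), the agent still gets a signal that pushes it toward rooms it has not seen yet.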
Vincent closes by asking the right question:
But why do we need curious AI in the first place? What good does it do us, apart from providing humorous parallels to our human tendency to get ensnared by random patterns?
The big reason is that curiosity helps computers learn on their own.
This is why curiosity is perhaps the most critical skill for people in an uncertain and changing world, and the same holds for AI. If we want to learn new ways of dealing with the world, new skills and techniques, the best starting point is curiosity.