Virtual AI Agents Use Previously Unknown Strategies in Simulated Hide-and-Seek Game
In their quest to “ensure that artificial general intelligence benefits all of humanity”, researchers at OpenAI discovered that their virtual AI agents, engaged in a simulated game of hide-and-seek, had developed six distinct strategies and counterstrategies, some of which the researchers did not know the environment supported. The team wrote:
Through training in our new simulated hide-and-seek environment, agents build a series of six distinct strategies and counterstrategies, some of which we did not know our environment supported. The self-supervised emergent complexity in this simple environment further suggests that multi-agent co-adaptation may one day produce extremely complex and intelligent behavior.
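The strategies and counterstrategies described above emerge from competitive self-play: the two teams receive opposing rewards, so any improvement by one side pressures the other to adapt. The sketch below is illustrative only, not OpenAI's actual implementation; the `team_rewards` function and the episode data are hypothetical, assuming a zero-sum setup in which hiders are rewarded when every hider is out of the seekers' sight and seekers receive the negation.

```python
# Toy sketch of a zero-sum hide-and-seek reward (illustrative, not OpenAI's code).
# Hiders get +1 on a step where no hider is seen, -1 otherwise;
# seekers always get the negation, making the game strictly competitive.
def team_rewards(hider_seen):
    """hider_seen: list of bools, one per hider, True if any seeker sees them."""
    all_hidden = not any(hider_seen)
    hider_reward = 1.0 if all_hidden else -1.0
    return {"hiders": hider_reward, "seekers": -hider_reward}

# A hypothetical three-step episode with two hiders: the seekers
# spot the first hider on the final step, flipping the reward.
episode = [[False, False], [False, False], [True, False]]
totals = {"hiders": 0.0, "seekers": 0.0}
for step in episode:
    rewards = team_rewards(step)
    totals["hiders"] += rewards["hiders"]
    totals["seekers"] += rewards["seekers"]
print(totals)  # → {'hiders': 1.0, 'seekers': -1.0}
```

Because the reward is zero-sum, neither team can improve without making the other worse off, which is the pressure that drives each new strategy to be answered by a counterstrategy.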
This surprising learning behavior suggests that intrinsic motivation and competition between agents drive the emergence of new strategies, although the researchers are still investigating the underlying mechanisms. On the difficulty of building such environments, the team noted:
We’ve shown that agents can learn sophisticated tool use in a high fidelity physics simulator; however, there were many lessons learned along the way to this result. Building environments is not easy and it is quite often the case that agents find a way to exploit the environment you build or the physics engine in an unintended way.