RL Weekly 14: OpenAI Five and Berkeley Blue
Published
OpenAI Five wins 2-0 against OG eSports
What it is
OpenAI Five is a Dota 2 artificial intelligence developed by OpenAI. Previously, in The International 2018 (TI8), the most prestigious annual Dota 2 tournament, OpenAI Five lost against paiN gaming and an all-star team of former Chinese players. On April 13th, OpenAI Five played against OG eSports, a team that won TI8, and under a few restricted rules, it won 2-0.
OpenAI also showcased a demonstration match of two human players with OpenAI Five and another two human players with OpenAI Five, and announced “OpenAI Five Arena,” where players around the world can register to play with or against OpenAI Five.
Why it matters
There have been a few substantial “AI vs. Human” events. These include Deep Blue vs Kasparov (Chess), AlphaGo vs. Lee Sedol (Go), and AlphaStar vs. MaNa (StarCraft 2). StarCraft 2 and Dota 2 provide unique challenges. First, each game requires a few tens of thousands of actions, whereas a game of chess and Go ends after a few hundred moves. Similarly, compared to chess and Go, the dimensions of the action space and the observation space is massive. Finally, most of the map is covered by “fog of war”, so the state of the game is only partially observable to the agent. Despite such challenges, OpenAI Five showed great prowess in Dota 2.
It is worth noting how much computation was needed to train OpenAI Five. OpenAI estimated that OpenAI Five has played 45000 years of Dota 2, which is not the amount of computation most companies can afford. Even with this scale, the results some caveats. Dota II originally has 117 “heroes” that the players can choose from, but in this match only 17 heroes were allowed. Because each heroes are unique and a different combination of heroes could result in very different games, unrestricted pool of heroes would have exponentially increased the amount of training.
Read more
- OpenAI Five (OpenAI Post)
- OpenAI Five Finals (Twitch Replay)
- Greg Brockman’s Live Tweets (Twitter Thread)
- Miles Brundage’s analogy of AlphaGo and Five (Twitter Thread)
External Resources
- Mike Cook’s Live Tweets (Twitter Thread)
- OpenAI Five Dota2 Finals match livestream has begun (RL Subreddit)
- OpenAI 5 vs Dota 2 World Champions happening now! (ML Subreddit)
Berkeley releases BLUE robot
What it is
The Robot Learning Lab at UC Berkeley released BLUE: Berkeley robot for Learning in Unstructured Environments. Blue is a human scale arm with 7 degrees of freedom (DOF) and 2kg payload that costs less than $5000 if produced in volumes of over 1500.
Why it matters
Human-scale robots with medium-to-high degree of freedom are extremely expensive: similar robotic arms cost more than $30000. By defining “useful” bandwidth and payload metrics for a set of outlined tasks, the team was able to successfully trade off unneeded performance for these tasks and reduce costs.
Read more
- Project Blue (Official Website)
- Quasi-Direct Drive for Low-Cost Compliant Robotic Manipulation (ArXiv Preprint)
- Berkeley Open Arms (GitHub Organization)
Some more exciting news in RL:
- Researchers at the RLAI lab of University of Alberta proposed a model-based algorithm with expectation models.
- The whitepaper for Facebook’s AI Habitat was released.
- Researchers at UC Berkeley proposed a new meta-RL algorithm called Guided Meta-Policy Search.
- Researchers at DeepMind, OpenAI, UC Berkeley, and Google Brain proposed a search-tree-free method called the Policy Gradient Search as an alternative to Monte Carlo Tree Search (MCTS).
- The Exploration in RL workshop was announced to be part of ICML2019.