Atari James Bond Environment
Overview
The player controls the titular character of James Bond across four levels. The player is given a multi-purpose vehicle that acts as an automobile, a plane, and a submarine. The vehicle can fire shots and flare bombs, and travels from left to right as the player progresses through each level. The player can shoot or avoid enemies and obstacles that appear throughout the game, including boats, frogmen, helicopters, missiles, and mini-submarines.
Description from Wikipedia
Performances of RL Agents
We list various reinforcement learning algorithms that were tested in this environment. These results are from RL Database.
Human Starts
Result | Algorithm | Source |
---|---|---|
3961.0 | Prioritized DDQN (rank, tuned) | Prioritized Experience Replay |
3511.5 | Prioritized DDQN (prop, tuned) | Prioritized Experience Replay |
1074.5 | Prioritized DQN (rank) | Prioritized Experience Replay |
835.5 | DuDQN | Dueling Network Architectures for Deep Reinforcement Learning |
613.0 | A3C LSTM | Asynchronous Methods for Deep Reinforcement Learning |
585.0 | PDD DQN | Dueling Network Architectures for Deep Reinforcement Learning |
573.0 | DDQN (tuned) | Deep Reinforcement Learning with Double Q-learning |
541.0 | A3C FF | Asynchronous Methods for Deep Reinforcement Learning |
444.0 | Gorila DQN | Massively Parallel Methods for Deep Reinforcement Learning |
416.0 | DDQN | Deep Reinforcement Learning with Double Q-learning |
368.5 | Human | Massively Parallel Methods for Deep Reinforcement Learning |
351.5 | A3C FF 1 day | Asynchronous Methods for Deep Reinforcement Learning |
348.5 | DQN | Massively Parallel Methods for Deep Reinforcement Learning |
33.5 | Random | Massively Parallel Methods for Deep Reinforcement Learning |
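
Raw scores are hard to compare across games, so Atari results are often restated relative to the Random and Human rows. The sketch below applies that convention to the Human Starts table above. This page does not report the normalized values itself; the helper name `human_normalized` and the dictionary `human_starts` are illustrative assumptions, and only a subset of rows is copied in for brevity.

```python
# Scores copied from the Human Starts table above (subset of rows).
human_starts = {
    "Prioritized DDQN (rank, tuned)": 3961.0,
    "DuDQN": 835.5,
    "A3C LSTM": 613.0,
    "DDQN (tuned)": 573.0,
    "Gorila DQN": 444.0,
    "Human": 368.5,
    "DQN": 348.5,
    "Random": 33.5,
}


def human_normalized(score, random_score, human_score):
    """Express a score as a percentage of the random-to-human gap.

    0% corresponds to the random agent and 100% to the human baseline.
    """
    return 100.0 * (score - random_score) / (human_score - random_score)


random_score = human_starts["Random"]
human_score = human_starts["Human"]

# Print the rows from best to worst with their normalized score.
for name, score in sorted(human_starts.items(), key=lambda kv: -kv[1]):
    pct = human_normalized(score, random_score, human_score)
    print(f"{name:32s} {score:8.1f} {pct:8.1f}%")
```

Under this convention any value above 100% means the agent outscored the human baseline listed in the table, and DQN at 348.5 lands slightly below it.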
No-op Starts
Normal Starts
Result | Algorithm | Source |
---|---|---|
560.7 | PPO | Proximal Policy Optimization Algorithms |
261.8 | ACER | Proximal Policy Optimization Algorithms |
52.3 | A2C | Proximal Policy Optimization Algorithms |