- Observational overfitting: The agent overfits to properties of the observation that are irrelevant to the latent dynamics of the MDP.
- Effect: The agent may rely on these spurious features, which can hinder generalization.
- Evidence 1: The scoreboard and background objects are highlighted in red in the saliency map.
- Evidence 2: Covering the scoreboard with a black rectangle during training increased test performance by 10%.
- Solution?: Overparametrization can help as a form of “implicit regularization,” improving generalization to the test set.
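
The masking experiment in Evidence 2 could be sketched roughly as follows. This is only an illustration, not the original authors' code: the helper name `mask_region`, the 64x64 RGB frame, and the scoreboard position (an 8x24 strip in the top-left corner) are all assumptions made for the example.

```python
import numpy as np


def mask_region(obs: np.ndarray, top: int, left: int, height: int, width: int) -> np.ndarray:
    """Return a copy of an image observation (H x W or H x W x C) with a rectangle set to black."""
    masked = obs.copy()  # leave the original observation untouched
    masked[top:top + height, left:left + width, ...] = 0
    return masked


# Hypothetical all-white 64x64 RGB frame standing in for an environment observation.
frame = np.full((64, 64, 3), 255, dtype=np.uint8)

# Assumed scoreboard location: top-left strip, 8 pixels tall and 24 pixels wide.
masked_frame = mask_region(frame, top=0, left=0, height=8, width=24)
```

In a training loop, a function like this would be applied to every observation before it reaches the policy, so the agent never sees the scoreboard pixels.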