Repository containing code and notebooks exploring how to solve Atari's Pong through Reinforcement Learning

kuds/rl-atari-pong

Playing Atari's Pong with Reinforcement Learning

Deep Q Learning (DQN)

Proximal Policy Optimization (PPO)

Results

Hardware: Google Colab T4

| Model Type | Average Reward | Training Time (h:mm:ss) | Total Training Steps |
| --- | --- | --- | --- |
| PPO | 21.0 | 5:32:21 | 10,000,000 |
| DQN | 20.6 | 11:56:00 | 10,000,000 |
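
To compare the two runs on equal footing, the reported training times can be converted into approximate throughput. The sketch below assumes the times are wall-clock durations in h:mm:ss format; the `to_seconds` helper and the `runs` dictionary are illustrative names, not part of the repository.

```python
def to_seconds(hms: str) -> int:
    """Convert an 'h:mm:ss' duration string into seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


# Values copied from the results table: (training time, total training steps).
runs = {
    "PPO": ("5:32:21", 10_000_000),
    "DQN": ("11:56:00", 10_000_000),
}

# Average environment steps processed per second of training.
throughput = {
    name: steps / to_seconds(duration) for name, (duration, steps) in runs.items()
}

for name, steps_per_second in throughput.items():
    print(f"{name}: {steps_per_second:.0f} steps/s")
```

Under these assumptions, PPO processed the same number of steps in less than half the time DQN needed on the same hardware.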

Training Notes

  • When training in a Google Colab notebook with the high-memory option enabled, keep the replay buffer size at or below 850,000, as larger buffers can run into memory issues
  • When training in more complex environments or with multiple simulated environments (n_envs > 1), DQN is very sensitive to hyperparameter settings
  • The Stable Baselines3 implementation of Soft Actor-Critic (SAC) supports only continuous action spaces, so it cannot be used with Atari's Pong, which uses discrete actions
  • When using RLlib, be mindful of your resources: training jobs may never start (remaining in pending status) if not enough CPUs or GPUs are allocated
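
The buffer size note can be sanity-checked with a rough memory estimate. The sketch below assumes the common Atari preprocessing of four stacked 84x84 grayscale frames stored as one byte per pixel; the repository's actual observation shape is not stated here, so treat these numbers as illustrative. The `replay_buffer_bytes` function is a hypothetical helper, and it counts observation storage only (actions, rewards, and done flags are comparatively small).

```python
GIB = 1024 ** 3


def replay_buffer_bytes(
    buffer_size: int,
    obs_shape: tuple = (4, 84, 84),  # assumed: 4 stacked 84x84 grayscale frames
    bytes_per_pixel: int = 1,        # assumed: uint8 pixels
    store_next_obs: bool = True,     # whether next observations are stored separately
) -> int:
    """Estimate the bytes needed to hold the observations of a replay buffer."""
    obs_bytes = bytes_per_pixel
    for dim in obs_shape:
        obs_bytes *= dim
    copies = 2 if store_next_obs else 1
    return buffer_size * obs_bytes * copies


for size in (100_000, 850_000, 1_000_000):
    both = replay_buffer_bytes(size) / GIB
    single = replay_buffer_bytes(size, store_next_obs=False) / GIB
    print(
        f"{size:>9,} transitions: "
        f"{both:5.1f} GiB (obs + next_obs), {single:5.1f} GiB (obs only)"
    )
```

Memory grows linearly with the buffer size, so a buffer of several hundred thousand image observations can plausibly account for most of a notebook's RAM. Stable Baselines3 exposes an `optimize_memory_usage` option on its replay buffer that avoids storing next observations separately, which roughly corresponds to the "obs only" column.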

Blog Posts
