Exercise 05

In this exercise, we revisit the included racetrack_environment to study temporal-difference (TD) learning algorithms.

Tasks:

  1. policy evaluation using TD learning
  2. on-policy epsilon-greedy control using TD learning
  3. off-policy epsilon-greedy control using TD learning → Q-learning
  4. using double Q-learning in stochastic environments
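
For orientation, the update rules behind the four tasks can be sketched in tabular form. This is a minimal sketch under stated assumptions, not the exercise's reference solution: states and actions are assumed to be plain integer indices, the value tables are assumed to be NumPy arrays, and all function names are hypothetical. It does not use the racetrack_environment interface, so the racetrack state would first have to be mapped to such indices.

```python
import numpy as np


def td0_update(V, s, r, s_next, done, alpha=0.1, gamma=1.0):
    """Task 1, TD(0) policy evaluation: move V[s] toward r + gamma * V[s_next]."""
    target = r + (0.0 if done else gamma * V[s_next])
    V[s] += alpha * (target - V[s])


def epsilon_greedy(Q, s, epsilon, rng):
    """Random action with probability epsilon, otherwise a greedy action."""
    if rng.random() < epsilon:
        return int(rng.integers(Q.shape[1]))
    return int(np.argmax(Q[s]))


def sarsa_update(Q, s, a, r, s_next, a_next, done, alpha=0.1, gamma=1.0):
    """Task 2, on-policy control: bootstrap from the action actually taken next."""
    target = r + (0.0 if done else gamma * Q[s_next, a_next])
    Q[s, a] += alpha * (target - Q[s, a])


def q_learning_update(Q, s, a, r, s_next, done, alpha=0.1, gamma=1.0):
    """Task 3, off-policy control: bootstrap from the greedy action in s_next."""
    target = r + (0.0 if done else gamma * np.max(Q[s_next]))
    Q[s, a] += alpha * (target - Q[s, a])


def double_q_update(Q1, Q2, s, a, r, s_next, done, rng, alpha=0.1, gamma=1.0):
    """Task 4, double Q-learning: one table selects the action, the other evaluates it."""
    # Pick at random which table is updated in this step.
    Qa, Qb = (Q1, Q2) if rng.random() < 0.5 else (Q2, Q1)
    a_star = int(np.argmax(Qa[s_next]))
    target = r + (0.0 if done else gamma * Qb[s_next, a_star])
    Qa[s, a] += alpha * (target - Qa[s, a])
```

The only difference between the SARSA and Q-learning updates is the bootstrap term: the value of the action actually taken next versus the maximum over actions. Double Q-learning separates action selection from action evaluation, which reduces the overestimation that the max operator can cause when rewards or transitions are noisy.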