Throughout training, you can monitor the progress of your RL agent. Metrics on Amazon CloudWatch show how your agent is performing, and the SageMaker notebook provides step-by-step instructions for visualizing these metrics locally within the notebook itself. The notebook displays graphs similar to the following:

This graph shows how the episode reward mean changes over the course of training. This is the mean reward that the agent receives per episode of training. With reinforcement learning, the goal is to maximize the reward, so an upward trend indicates that the agent is learning. Monitor this graph to see how your agent is performing and to determine whether you need to tune the model further.
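To make the metric concrete, here is a minimal sketch of how a running episode reward mean could be computed from a list of per-episode rewards. The function name `episode_reward_mean`, the `window` parameter, and the sample rewards are assumptions made for illustration; the notebook's RL toolkit computes and reports this metric for you, and its exact averaging window is not stated here.

```python
def episode_reward_mean(episode_rewards, window=100):
    """Return the running mean reward after each episode.

    `window` (hypothetical default) limits the average to the most
    recent episodes so that old, poor episodes stop dominating.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    means = []
    for i in range(len(episode_rewards)):
        start = max(0, i + 1 - window)
        recent = episode_rewards[start:i + 1]
        means.append(sum(recent) / len(recent))
    return means


# Made-up rewards for four episodes, averaged over the last two.
sample_means = episode_reward_mean([-200, -100, 0, 100], window=2)
print(sample_means)
```

A curve built this way that keeps rising suggests the agent is still improving; a curve that flattens early or falls is a hint that hyperparameters may need tuning.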

You can also see the metrics in the AWS Management Console by following these steps:

  • Go to Amazon SageMaker in the AWS Management Console.

  • Click on Training jobs in the left navigation pane.

  • Click on a training job whose status is either Stopped or Completed.

  • Scroll down to the Monitor section.

  • You will see graphs such as episode reward mean and episode reward max, as well as CPU utilization, memory utilization, and more.
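The console steps above can also be approximated in code. The sketch below reads a training job's status and its final metric values from the SageMaker `DescribeTrainingJob` response. It assumes the client passed in is a boto3 SageMaker client (`boto3.client("sagemaker")`) and that your job published metrics; the helper name `summarize_final_metrics` is hypothetical, and the metric names you get back depend on the metric definitions configured in the notebook.

```python
def summarize_final_metrics(sagemaker_client, job_name):
    """Return (status, {metric_name: value}) for one training job.

    `sagemaker_client` is assumed to be a boto3 SageMaker client.
    Only the final reported value of each metric is returned; the
    full time series is what the console graphs display.
    """
    response = sagemaker_client.describe_training_job(TrainingJobName=job_name)
    status = response["TrainingJobStatus"]
    # FinalMetricDataList may be absent if no metrics were emitted.
    metrics = {
        entry["MetricName"]: entry["Value"]
        for entry in response.get("FinalMetricDataList", [])
    }
    return status, metrics
```

With AWS credentials configured, a call might look like `summarize_final_metrics(boto3.client("sagemaker"), "<your-training-job-name>")`, where the job name is the one listed under Training jobs in the console.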

Remember, you can check the leaderboards and the live video replay to see how other teams are doing!

Detailed, step-by-step instructions are in the SageMaker notebook called lunarlander-tutorial.ipynb.