Covers: theory of Adam

- Where does this optimizer come from?

This is an optional module for students who want to learn more about Adam. It is the first article to introduce this concept.

Amir Parizi

**Total time needed:** ~2 hours

- Learning Objectives
  - Learn the theory behind Adam as an optimizer and how to implement it in Python
- Potential Use Cases
  - The algorithm leverages adaptive learning rate methods to find an individual learning rate for each parameter.
- Target Audience
  - INTERMEDIATE

Go through the following **annotated items** *in order*:

ARTICLE 1. Intro to mathematical optimization

- What is mathematical optimization?
- Why do we need to optimize a cost function in ML algorithms?

10 minutes

ARTICLE 2. RMSprop

- What is RMSprop?
- How does this algorithm work?

10 minutes
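
As a rough preview of what this item covers, the following is a minimal sketch of one RMSprop update in plain Python. The function name `rmsprop_step`, the hyperparameter defaults, and the toy cost function are illustrative assumptions, not taken from the linked article. The idea shown is that each gradient is divided by a running root-mean-square of its own recent values, so every parameter gets its own effective step size.

```python
import math


def rmsprop_step(params, grads, sq_avg, lr=0.01, rho=0.9, eps=1e-8):
    """One RMSprop update on plain Python lists (hypothetical helper)."""
    # Running average of squared gradients, kept separately per parameter.
    new_sq_avg = [rho * s + (1 - rho) * g * g for s, g in zip(sq_avg, grads)]
    # Scale each gradient by the root of its own running average.
    new_params = [
        p - lr * g / (math.sqrt(s) + eps)
        for p, g, s in zip(params, grads, new_sq_avg)
    ]
    return new_params, new_sq_avg


# Toy cost f(x, y) = x^2 + 100 * y^2, with gradient (2x, 200y).
params = [1.0, 1.0]
grads = [2 * params[0], 200 * params[1]]
params, sq_avg = rmsprop_step(params, grads, [0.0, 0.0])
```

In this toy setup the two gradients differ by a factor of 100, yet after the first update both parameters have moved by nearly the same amount, because each gradient is normalized by its own magnitude.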

VIDEO 3. Gradient Descent with Momentum

- What is momentum in GD?
- How does momentum help the optimizer converge faster?

10 minutes
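
For orientation before watching, here is a minimal sketch of gradient descent with momentum in plain Python. It assumes the exponentially weighted average form of the velocity; the name `momentum_step`, the hyperparameter values, and the toy cost function are illustrative assumptions rather than anything specified by the video.

```python
def momentum_step(params, grads, velocity, lr=0.1, beta=0.9):
    """One gradient-descent-with-momentum update (hypothetical helper)."""
    # Velocity is an exponentially weighted average of past gradients.
    new_velocity = [beta * v + (1 - beta) * g for v, g in zip(velocity, grads)]
    # Step along the smoothed direction instead of the raw gradient.
    new_params = [p - lr * v for p, v in zip(params, new_velocity)]
    return new_params, new_velocity


# Minimize f(x) = x^2, whose gradient is 2x, starting from x = 5.
x, v = [5.0], [0.0]
for _ in range(300):
    x, v = momentum_step(x, [2 * x[0]], v)
```

Averaging past gradients damps components that flip sign from step to step and reinforces components that keep pointing the same way, which is the usual intuition for why momentum can speed up progress.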

VIDEO 4. Adam optimization algorithm

- What is the Adam optimizer?
- What is the math behind this optimizer, and how does Adam work?

10 minutes
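
To make the math concrete, the sketch below shows one Adam update in plain Python, assuming the standard formulation with bias-corrected first and second moment estimates. The helper name `adam_step` and the toy cost function are illustrative assumptions; the default hyperparameters (`lr=0.001`, `beta1=0.9`, `beta2=0.999`, `eps=1e-8`) are the commonly cited ones.

```python
import math


def adam_step(params, grads, m, v, t,
              lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update on plain Python lists; t is the 1-based step count."""
    # First moment: running average of gradients (the momentum part).
    m = [beta1 * mi + (1 - beta1) * g for mi, g in zip(m, grads)]
    # Second moment: running average of squared gradients (the RMSprop part).
    v = [beta2 * vi + (1 - beta2) * g * g for vi, g in zip(v, grads)]
    # Bias correction compensates for m and v starting at zero.
    m_hat = [mi / (1 - beta1 ** t) for mi in m]
    v_hat = [vi / (1 - beta2 ** t) for vi in v]
    # Per-parameter step: smoothed gradient scaled by its own RMS magnitude.
    params = [
        p - lr * mh / (math.sqrt(vh) + eps)
        for p, mh, vh in zip(params, m_hat, v_hat)
    ]
    return params, m, v


# Toy problem: minimize f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
x, m, v = [0.0], [0.0], [0.0]
for t in range(1, 501):
    grad = [2 * (x[0] - 3.0)]
    x, m, v = adam_step(x, grad, m, v, t, lr=0.1)
```

On the very first step the bias-corrected ratio reduces to roughly the sign of the gradient, so the parameter moves by about `lr` regardless of how large the gradient is.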

ARTICLE 5. Adam optimizer [more advanced math concepts behind this algorithm]

- What is the Adam optimizer?

25 minutes

LIBRARY 6. Implementing Adam optimizer in Python using Keras

- How do you implement the Adam optimizer in Python?

20 minutes

PAPER 7. ADAM: A METHOD FOR STOCHASTIC OPTIMIZATION

20 minutes

