Covers: theory of Momentum

- What is momentum in GD?
- How does momentum help the optimizer converge faster?

In this third module, we learn about gradient descent with momentum, one of the main components of the Adam optimizer. Andrew explains the concept and walks through the math behind it.
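As a rough illustration of the idea, here is a minimal sketch of gradient descent with momentum in Python. It assumes the exponentially weighted average form of the update (a velocity `v` that blends the previous velocity with the current gradient); the video may use a different but equivalent notation. The function name `gradient_descent_momentum`, the test function `grad_f`, and the values of `lr`, `beta`, and `steps` are hypothetical choices made only for this example.

```python
import numpy as np


def gradient_descent_momentum(grad_fn, w0, lr=0.1, beta=0.9, steps=300):
    """Sketch of gradient descent with momentum (assumed EWA form)."""
    w = np.asarray(w0, dtype=float).copy()
    v = np.zeros_like(w)  # velocity: running average of past gradients
    for _ in range(steps):
        g = grad_fn(w)
        # Blend the previous velocity with the current gradient.
        v = beta * v + (1.0 - beta) * g
        # Step along the smoothed direction instead of the raw gradient.
        w = w - lr * v
    return w


# Hypothetical test problem: f(w) = 0.5 * (w[0]**2 + 10 * w[1]**2),
# whose minimum is at the origin.
def grad_f(w):
    return np.array([1.0, 10.0]) * w


w_final = gradient_descent_momentum(grad_f, w0=[5.0, 5.0])
print(w_final)  # should be very close to the minimum at [0, 0]
```

Setting `beta=0` turns the velocity into the raw gradient, so the sketch reduces to plain gradient descent. Larger values of `beta` average over more past gradients, which damps steps that keep changing sign.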

- Objectives
  - Learn the theory behind Adam as an optimizer and how to implement it in Python
- Potential Use Cases
  - The algorithm leverages adaptive learning rate methods to find an individual learning rate for each parameter.
- Who is this for?
  - Intermediate

ARTICLE 1. Intro to mathematical optimization

- What is mathematical optimization?
- Why do we need to optimize a cost function in ML algorithms?

10 minutes

ARTICLE 2. RMSprop

- What is RMSprop?
- How does this algorithm work?

10 minutes

VIDEO 3. Gradient Descent with Momentum

- What is momentum in GD?
- How does momentum help the optimizer converge faster?

10 minutes

VIDEO 4. Adam optimization algorithm

- What is Adam optimizer?
- What is the math behind this optimizer, and how does Adam work?

10 minutes

ARTICLE 5. Adam optimizer [more advanced math concepts behind this algorithm]

- What is Adam optimizer?

25 minutes

LIBRARY 6. Implementing Adam optimizer in Python using Keras

- How to implement Adam optimizer in Python?

20 minutes

PAPER 7. ADAM: A METHOD FOR STOCHASTIC OPTIMIZATION

- Where does this optimizer come from?

20 minutes

