Covers: theory of AdaGrad

0. What is AdaGrad?

This is the first module on the AdaGrad optimizer. It introduces AdaGrad in plain terms, without the math behind it, which makes it a good place to start learning about this optimizer.

- Objectives
  - Learn the theory behind AdaGrad as an optimizer and how to implement it in Python
- Potential Use Cases
  - AdaGrad is an algorithm for gradient-based optimization. It is well suited to sparse data, such as the data found in NLP or image recognition.
- Who Is This For?
  - Intermediate

ARTICLE 1. Intro to mathematical optimization

- What is mathematical optimization?
- Why do we need to optimize a cost function in ML algorithms?

10 minutes

VIDEO 2. Gradient Descent

- What is Gradient Descent (GD)?
- How does GD work in Python?

10 minutes

ARTICLE 3. Learning Rate

- What is learning rate?
- How can I choose a better learning rate?

20 minutes

ARTICLE 4. AdaGrad: Introduction (no math!)

- What is AdaGrad?

10 minutes

ARTICLE 5. Adaptive Gradient (AdaGrad): Introduction (with more advanced math concepts)

- What is AdaGrad?
- What is the math behind this optimizer?

30 minutes

ARTICLE 6. AdaGrad in Python

- How do I implement AdaGrad in Python?

10 minutes
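
As a rough preview of the kind of code this module works toward, here is a minimal sketch of the standard AdaGrad update in plain Python. It is not the article's own implementation: the names `adagrad_minimize` and `quadratic_grad`, the toy quadratic objective, and the default values for `lr`, `eps`, and `steps` are all assumptions chosen for illustration.

```python
import math


def adagrad_minimize(grad_fn, x0, lr=0.5, eps=1e-8, steps=500):
    """Minimize a function with AdaGrad, given a function returning its gradient.

    Hypothetical helper for illustration; lr, eps, and steps are assumed defaults.
    """
    x = list(x0)
    # Running sum of squared gradients, one entry per parameter.
    grad_sq_sum = [0.0] * len(x)
    for _ in range(steps):
        grad = grad_fn(x)
        for i, g in enumerate(grad):
            grad_sq_sum[i] += g * g
            # Each parameter gets its own effective step size:
            # the larger its past gradients, the smaller the step.
            x[i] -= lr * g / (math.sqrt(grad_sq_sum[i]) + eps)
    return x


# Toy objective (an assumption): f(x) = (x0 - 3)^2 + (x1 + 1)^2.
TARGET = [3.0, -1.0]


def quadratic_grad(x):
    return [2.0 * (xi - ti) for xi, ti in zip(x, TARGET)]


x_min = adagrad_minimize(quadratic_grad, [0.0, 0.0])
print(x_min)
```

The per-parameter accumulator is what makes the learning rate adaptive: parameters that rarely receive gradients, as is common with sparse data, keep a larger effective step size.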

PAPER 7. Adaptive Subgradient Methods for Online Learning and Stochastic Optimization (optional)

- Where does this optimizer come from?

30 minutes

