ARTICLEAdaGrad : Introduction (No math!)

Covers: theory of AdaGrad
Questions this item adddesses:
  • What is Adagrad?
How to use this item?

This is the first module about the AdaGrad optimizer. This article explains AdaGrad in an​ easy way without any math behind this optimizer. This would be a good place to start learning about this optimizer



Amir PariziTotal time needed: ~2 hours
Learning Objectives
Learn the theory behind AdaGrad as an optimizer and how to implement it in Python
Potential Use Cases
Adagrad is an algorithm for gradient-based optimization. it is well-suited when dealing with sparse data (NLP or image recognition).
Target Audience
Go through the following annotated items in order:
ARTICLE 1. Intro to mathematical optimization
  • What is mathematical optimization?
  • Why do we need to optimize a cost function in ML algorithms?
10 minutes
VIDEO 2. Gradient Descent
  • What is Gradient Decent(GD)?
  • How does GD work in python?
10 minutes
ARTICLE 3. Learning Rate
  • What is learning rate?
  • How can I make it better?
20 minutes
ARTICLE 4. AdaGrad : Introduction (No math!)
  • What is Adagrad?
10 minutes
ARTICLE 5. Adaptive Gradient (adaGrad) : Introduction [ With more advanced math concepts ]
  • What is AdaGrad?
  • What is the math behind this optimizer?
30 minutes
ARTICLE 6. AdaGrad in Python
  • How to implement AdaGrad in Python
10 minutes
PAPER 7. Adaptive Subgradient Methods for Online Learning and Stochastic Optimization (optional)
  • Where does this optimizer come from?
30 minutes

Concepts Convered