Covers: theory of Gradient Descent
Questions this item addresses:
  • What is Gradient Decent(GD)?
  • How does GD work in python?
How to use this item?

In the second module, we are going to learn about Gradient Descent as one of the most important optimization algorithms in ML. Andrew Ng presents this concept with a very easy explanation in this video.

Fail to play? Open the link directly:
0 comment


Total time needed: ~2 hours
Learn the theory behind AdaGrad as an optimizer and how to implement it in Python
Potential Use Cases
Adagrad is an algorithm for gradient-based optimization. it is well-suited when dealing with sparse data (NLP or image recognition).
Who is This For ?
Click on each of the following annotated items to see details.
ARTICLE 1. Intro to mathematical optimization
  • What is mathematical optimization?
  • Why do we need to optimize a cost function in ML algorithms?
10 minutes
VIDEO 2. Gradient Descent
  • What is Gradient Decent(GD)?
  • How does GD work in python?
10 minutes
ARTICLE 3. Learning Rate
  • What is learning rate?
  • How can I make it better?
20 minutes
ARTICLE 4. AdaGrad : Introduction (No math!)
  • What is Adagrad?
10 minutes
ARTICLE 5. Adaptive Gradient (adaGrad) : Introduction [ With more advanced math concepts ]
  • What is AdaGrad?
  • What is the math behind this optimizer?
30 minutes
ARTICLE 6. AdaGrad in Python
  • How to implement AdaGrad in Python
10 minutes
PAPER 7. Adaptive Subgradient Methods for Online Learning and Stochastic Optimization (optional)
  • Where does this optimizer come from?
30 minutes

Concepts Covered

0 comment