Neighbourhood Based Collaborative Filtering - Basic ideas

Total time needed: ~2 hours
Learning Objectives
Quickly understand the basic ideas of neighbourhood-based collaborative filtering
Potential Use Cases
Build recommendation engines, identify trends, etc.
Target Audience
BEGINNER Python developers
Go through the following annotated items in order:
ARTICLE 1. Ratings Matrix
  • What is the basic data structure of this matrix and what are its properties?
  • What is the long tail, how does it impact recommendation systems, and what can we do about it?
30 minutes
ARTICLE 2. User-based and Item-based similarity Computations
  • What are user-based and item-based models, and how do they perform compared with each other?
  • How can we build a recommender system on similarity measures such as cosine similarity and Pearson correlation? What does amplifying a similarity function do?
  • What is mean centering and why do we use it? What are the common alternatives?
  • How does inverse user frequency help to handle the long tail? Why should we handle the long tail in the first place?
  • What is the computational complexity of the overall system? How should we understand the offline and online phases?
30 minutes
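As a warm-up for the questions above, here is a minimal sketch of user-based prediction with Pearson similarity and mean centering. The toy ratings, the user and item names, and the helper functions are all hypothetical and only illustrate the idea; the article may define the similarity and the neighbourhood differently.

```python
from math import sqrt

# Hypothetical toy ratings: user -> {item: rating}. Missing items are unrated.
ratings = {
    "u1": {"a": 5, "b": 3, "c": 4},
    "u2": {"a": 4, "b": 2, "c": 3, "d": 4},
    "u3": {"a": 1, "b": 5, "d": 2},
}

def mean(user):
    """Average of all ratings given by one user."""
    vals = ratings[user].values()
    return sum(vals) / len(vals)

def pearson(u, v):
    """Pearson-style similarity over co-rated items, centered on each user's overall mean."""
    common = set(ratings[u]) & set(ratings[v])
    if len(common) < 2:
        return 0.0
    mu, mv = mean(u), mean(v)
    num = sum((ratings[u][i] - mu) * (ratings[v][i] - mv) for i in common)
    den_u = sqrt(sum((ratings[u][i] - mu) ** 2 for i in common))
    den_v = sqrt(sum((ratings[v][i] - mv) ** 2 for i in common))
    return num / (den_u * den_v) if den_u and den_v else 0.0

def predict(u, item, k=2):
    """Mean-centered weighted average over the k most similar users who rated the item."""
    peers = [(pearson(u, v), v) for v in ratings if v != u and item in ratings[v]]
    # Assumption: keep only positively correlated neighbours.
    peers = sorted((p for p in peers if p[0] > 0), reverse=True)[:k]
    if not peers:
        return mean(u)  # fall back to the user's own average
    num = sum(s * (ratings[v][item] - mean(v)) for s, v in peers)
    den = sum(abs(s) for s, _ in peers)
    return mean(u) + num / den

print(predict("u1", "d"))
```

Mean centering matters here because u1 and u2 use the rating scale differently: the prediction adds u2's deviation from their own average to u1's average instead of copying u2's raw rating.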
ARTICLE 3. Clustering and Similarity Based methods
  • What are the problems of sparsity and computational complexity?
  • What are SVD, PCA and k-means?
  • How can we use maximum likelihood estimation (MLE) to estimate missing values in the ratings matrix?
20 minutes
ARTICLE 4. Regression View of Neighbourhood based method
  • How are the similarity coefficients that we use similar to the learned weights in a linear regression model?
  • How can we use least-squares optimisation to learn the coefficients?
  • How can we handle sparsity and bias issues?
  • What does regularization do, and why is it needed?
30 minutes
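To make the regression view concrete, here is a minimal sketch that learns neighbour weights by regularized least squares instead of fixing them with a similarity function. The data, the function name `fit`, and the hyperparameters `lam`, `lr` and `epochs` are hypothetical, and plain gradient descent is just one simple way to minimise the objective; the article may use a different formulation.

```python
# Hypothetical mean-centered ratings. Each row of X holds one user's ratings
# of two neighbour items; y holds the same user's rating of the target item.
X = [[1.0, -1.0], [0.5, 0.5], [-1.0, 1.0], [-0.5, -0.5]]
y = [1.0, 0.5, -1.0, -0.5]

def fit(X, y, lam=0.1, lr=0.05, epochs=2000):
    """Minimise sum((w.x - y)^2) + lam * ||w||^2 by batch gradient descent."""
    w = [0.0] * len(X[0])
    for _ in range(epochs):
        # Gradient of the L2 penalty term.
        grad = [2 * lam * wj for wj in w]
        # Gradient of the squared-error term, accumulated over all users.
        for xi, yi in zip(X, y):
            err = sum(wj * xj for wj, xj in zip(w, xi)) - yi
            for j, xj in enumerate(xi):
                grad[j] += 2 * err * xj
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w

w_plain = fit(X, y, lam=0.0)  # ordinary least squares
w_ridge = fit(X, y, lam=0.1)  # regularized weights
print(w_plain, w_ridge)
```

In this toy data the target item tracks the first neighbour item exactly, so the unregularized weights approach (1, 0), while the penalty pulls the regularized weight on the first item slightly below 1.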

Concepts Covered