Joint Policy-Value Learning for Recommendation
Tuesday Aug 11 2020 16:00 GMT
Why This Is Interesting

Beating offline metrics in recommender systems is challenging, but the real question is how well the model performs on online metrics. This paper uses logged data from an existing model to achieve higher online evaluation scores.

Discussion Points
  1. Using an existing policy (model) to increase online KPIs by applying the dual bandit approach
  2. Combining a maximum likelihood estimation (MLE) objective with a counterfactual risk minimization (CRM) objective through a weighted average to produce more personalized recommendations
  3. Using the logged recommendations to improve personalization: the dual bandit setting also learns from recommendations that were shown but not clicked
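
The weighted combination in the discussion points above can be sketched roughly as follows. This is a minimal illustration rather than the paper's exact formulation: the function names, the sigmoid click model, the softmax policy, and the `alpha` weight are all assumptions made for the example. The MLE term uses both clicked and unclicked recommendations, while the CRM term is written here as a plain inverse-propensity-scored (IPS) estimate of the click rate under the new policy.

```python
import numpy as np


def softmax(scores):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = scores - scores.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def mle_loss(scores, actions, clicks):
    """Binary cross-entropy on the logged (shown) items.

    Unclicked recommendations (clicks == 0) still contribute through the
    log(1 - p) term, so the model also learns from them.
    """
    rows = np.arange(len(actions))
    p_click = sigmoid(scores[rows, actions])
    eps = 1e-12  # guards against log(0)
    return -np.mean(
        clicks * np.log(p_click + eps) + (1 - clicks) * np.log(1 - p_click + eps)
    )


def crm_loss(scores, actions, clicks, propensities):
    """Negative IPS estimate of the expected click rate under the softmax policy."""
    rows = np.arange(len(actions))
    pi = softmax(scores)[rows, actions]
    return -np.mean(clicks * pi / propensities)


def joint_loss(scores, actions, clicks, propensities, alpha=0.5):
    """Weighted average of the two objectives (alpha is a hypothetical knob)."""
    return alpha * mle_loss(scores, actions, clicks) + (1 - alpha) * crm_loss(
        scores, actions, clicks, propensities
    )


# Toy logged data: 2 impressions over 3 candidate items (illustrative values only).
scores = np.zeros((2, 3))            # model scores per (impression, item)
actions = np.array([0, 2])           # item shown by the logging policy
clicks = np.array([1.0, 0.0])        # observed feedback
propensities = np.array([0.5, 0.5])  # logging policy's probability of the shown item

loss = joint_loss(scores, actions, clicks, propensities, alpha=0.5)
```

Setting `alpha` to 1 recovers the pure likelihood objective and setting it to 0 recovers the pure IPS objective. In practice the loss would be minimized with an automatic differentiation library rather than evaluated once as done here.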
Time of Recording: Tuesday Aug 11 2020 16:00 GMT