Count-based models

Kevin McPherson
Total time needed: ~2 hours
Learning Objectives
With this list, you will learn the count-based approach to constructing vector space models (VSMs) in NLP.
Potential Use Cases
Finding which words co-occur in corpora, or modeling the topics of a specific document
Target Audience
BEGINNER
Python users who are new to machine learning
Go through the following annotated items in order:
OTHER 1. Vector Semantics
  • How are count-based models represented as a matrix?
20 minutes
OTHER 2. Improve Simple Co-Occurrence Counts
  • Are context words at different distances equally important? If not, how can we modify co-occurrence counts?
  • In language, word order is important; specifically, left and right contexts have different meanings. How can we distinguish between the left and right contexts?
30 minutes
ARTICLE 3. Creating a sparse Document Term Matrix for Topic Modeling with LDA
  • How do you create a term-document matrix from scratch and apply it to one of its common use cases?
20 minutes
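
As a rough illustration of the questions raised in the items above, the sketch below builds count-based representations from toy data in plain Python. The function names (`cooccurrence`, `document_term_matrix`), the 1/distance weighting, and the `L_`/`R_` prefixes are assumptions chosen for illustration. They are not taken from the linked resources, and other weighting schemes are equally reasonable.

```python
from collections import Counter, defaultdict


def cooccurrence(tokens, window=2, weight_by_distance=True, directional=True):
    """Count the context words within `window` positions of each target word.

    Returns a mapping: target word -> Counter of (possibly weighted) contexts.
    """
    counts = defaultdict(Counter)
    for i, target in enumerate(tokens):
        start = max(0, i - window)
        stop = min(len(tokens), i + window + 1)
        for j in range(start, stop):
            if j == i:
                continue  # a word is not its own context
            distance = abs(i - j)
            # Assumption: nearer words matter more, so weight by 1/distance.
            weight = 1.0 / distance if weight_by_distance else 1.0
            context = tokens[j]
            if directional:
                # Tag each context with its side so left and right stay distinct.
                context = ("L_" if j < i else "R_") + context
            counts[target][context] += weight
    return counts


def document_term_matrix(docs):
    """Build a dense document-term count matrix from tokenized documents.

    Returns (vocab, rows), where rows[d][t] is the count of vocab[t] in docs[d].
    """
    vocab = sorted({token for doc in docs for token in doc})
    rows = []
    for doc in docs:
        doc_counts = Counter(doc)
        rows.append([doc_counts[term] for term in vocab])
    return vocab, rows


tokens = "the cat sat on the mat".split()
counts = cooccurrence(tokens)
# For the target "sat": "cat" is one step to the left (weight 1.0),
# while the second "the" is two steps to the right (weight 0.5).

docs = [["cat", "sat"], ["cat", "cat", "mat"]]
vocab, rows = document_term_matrix(docs)
# vocab is ["cat", "mat", "sat"]; rows is [[1, 0, 1], [2, 1, 0]]
```

Setting `weight_by_distance=False` and `directional=False` recovers simple symmetric co-occurrence counts. The matrix here is dense for readability; on a real corpus most entries are zero, which is why a sparse representation is normally used.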

Concepts Covered