Covers: theory of Self-Attention
Estimated time needed to finish: 14 minutes
Questions this item addresses:
  • What are attention mechanisms?
  • How does time series data get used with self-attention?
How to use this item?

Watch from 4:44 to 10:03

If the video fails to play, open the link directly: https://www.youtube.com/watch?v=yGTUuEx3GkA
Author / creator: Rasa

Understand Self Attention

Total time needed: ~37 minutes
Objectives
With this shortlist you will understand how the self-attention mechanism can relate different positions of a single sequence in order to compute a representation of the sequence.
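The core idea stated above can be sketched in a few lines. This is a minimal NumPy illustration, assuming the simplest form of self-attention (no learned query/key/value weights, as introduced at the start of the Rasa video): each position's new representation is a similarity-weighted average over every position in the same sequence.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row max for numerical stability before exponentiating.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X):
    """Basic self-attention without learned weights.

    Each output row is a weighted average of all input rows,
    weighted by dot-product similarity between positions."""
    scores = X @ X.T                    # (seq_len, seq_len) pairwise similarities
    weights = softmax(scores, axis=-1)  # each row is a distribution summing to 1
    return weights @ X                  # re-weighted sequence representations

# Toy sequence of 3 token vectors with 2 dimensions each.
X = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
Y = self_attention(X)
```

Because the weights for each position sum to 1, every output vector stays inside the span of the input vectors; learned projections (queries, keys, values) are what the later resources add on top of this skeleton.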
Potential Use Cases
Help Google better discern the context of words in search queries.
Who is this for?
BEGINNER: Beginners looking to understand the Attention Is All You Need research paper.
Resources
VIDEO 1. Rasa Algorithm Whiteboard - Attention 1: Self Attention
  • What are attention mechanisms?
  • How does time series data get used with self-attention?
14 minutes
ARTICLE 2. The Illustrated Transformer
  • What does a high-level view of a transformer look like?
  • What makes up an encoding component?
  • What makes up a decoding component?
8 minutes
VIDEO 3. The Narrated Transformer Language Model
  • How can you assign meaning to numbers via embeddings?
  • What are token embeddings?
  • How does the last hidden state predict the last word?
  • What does a softmax function do?
15 minutes
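The last two questions in resource 3 can be illustrated together. This is a hedged sketch with a hypothetical three-word vocabulary and made-up embedding values: the model's final hidden state is scored against each vocabulary embedding, and a softmax turns those scores into a probability distribution over words.

```python
import numpy as np

def softmax(z):
    # Exponentiate (shifted by the max for stability) and normalize to sum to 1.
    e = np.exp(z - z.max())
    return e / e.sum()

# Hypothetical toy vocabulary and embedding matrix (vocab_size x hidden_dim).
vocab = ["the", "cat", "sat"]
embeddings = np.array([[0.1, 0.9],
                       [0.8, 0.2],
                       [0.4, 0.4]])

# Assumed final hidden state produced by the model for the current position.
hidden_state = np.array([0.7, 0.3])

logits = embeddings @ hidden_state       # one score per vocabulary word
probs = softmax(logits)                  # probability distribution over the vocabulary
predicted = vocab[int(np.argmax(probs))] # highest-probability word wins
```

The softmax guarantees the scores become non-negative and sum to 1, which is what lets the model's output be read as "how likely is each word here".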

Concepts Covered
