Covers: theory of domain-agnostic AI
Estimated time needed to finish: 40 minutes
Questions this item addresses:
  • How do you design an architecture for inputs and outputs of arbitrary size and semantics?
  • What is the role of Transformers and self-attention in domain-agnostic architectures?
  • How does Perceiver IO elaborate on and improve its predecessor, the Perceiver?
How to use this item?

Read the high-level overview of how Perceiver IO elaborates on its predecessor, the Perceiver. Perceiver IO is published here at arXiv!

You can also read a summary of Perceiver IO in this Medium post.

Aleksa Gordić, from the YouTube channel The AI Epiphany, breaks down the paper in this video. The video is less visual than the one from Yannic Kilcher, but from 25:00 Aleksa starts talking about Perceiver IO! (By the way, Aleksa was hired at DeepMind after recording this video.)

Author(s) / creator(s) / reference(s)
Drew Jaegle, Joao Carreira, Carl Doersch, David Ding, Catalin Ionescu

Perceivers: General Models For Any Data

Total time needed: ~3 hours
Objectives
Understand how Perceivers, built on a Transformer-like architecture, can generalize across multi-domain applications while solving the quadratic attention bottleneck.
Potential Use Cases
Multi-domain applications and understanding/dealing with many data types at once. Perceivers can also potentially replace state-of-the-art Transformer models (e.g., BERT) and ViTs, since no domain-specific preprocessing (e.g., tokenization) is needed.
Who is This For?
ADVANCED: Machine Learning Scientists willing to experiment with pre-trained multi-domain models.
Click on each of the following annotated items to see details.
Resources
VIDEO 1. Self-attention: Whiteboard video series
  • Why might we need to transform signals in sequences?
  • Can interactions between signals in a sequence be useful?
  • What is the high-level idea of self-attention?
  • What are keys, queries, and values, and how do they interact with the processed information in the self-attention architecture?
  • What is the scheme of self-attention within a neural network?
  • What does multi-head self-attention look like?
  • How can information be processed through multi-head self-attention and still end up with an output of the input dimension? (A minimal sketch follows this item.)
45 minutes
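To make the key/query/value mechanics concrete, here is a minimal NumPy sketch of scaled dot-product self-attention and its multi-head variant. The sizes (seq_len, d_model, n_heads) and weight initialization are illustrative assumptions, not taken from the video; the point is only that the multi-head output comes back to the input dimension.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention.
    x: (seq_len, d_model); Wq/Wk/Wv: (d_model, d_head)."""
    Q, K, V = x @ Wq, x @ Wk, x @ Wv            # project the input into queries/keys/values
    scores = Q @ K.T / np.sqrt(K.shape[-1])     # (seq_len, seq_len): every position attends to every position
    return softmax(scores) @ V                  # weighted sum of values, shape (seq_len, d_head)

def multi_head_self_attention(x, heads, Wo):
    """Run several heads in parallel, concatenate them, then project
    back to d_model so the output has the same dimension as the input."""
    out = np.concatenate([self_attention(x, *h) for h in heads], axis=-1)
    return out @ Wo                             # (seq_len, d_model)

# Toy example with illustrative sizes
rng = np.random.default_rng(0)
seq_len, d_model, n_heads = 5, 16, 4
d_head = d_model // n_heads
x = rng.normal(size=(seq_len, d_model))
heads = [tuple(rng.normal(size=(d_model, d_head)) for _ in range(3)) for _ in range(n_heads)]
Wo = rng.normal(size=(n_heads * d_head, d_model))
print(multi_head_self_attention(x, heads, Wo).shape)  # (5, 16): same shape as the input
```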
VIDEO 2. Transformer Encoder: Whiteboard video
  • What is the role of the self-attention mechanism in the Transformer architecture?
  • What are the main components of a Transformer?
  • What are the role and characteristics of the Transformer Encoder?
  • What is positional encoding, and which important limitation of the attention mechanism does it address? (See the sketch after this item.)
  • What is the advantage of multi-head self-attention over a recurrent neural network (RNN)?
20 minutes
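Because attention on its own is permutation-invariant, position has to be injected explicitly. Below is a minimal sketch of the sinusoidal positional encoding from the original Transformer paper, added to token embeddings before the encoder; the sizes in the example are illustrative.

```python
import numpy as np

def sinusoidal_positional_encoding(seq_len, d_model):
    """Sinusoidal encoding from "Attention Is All You Need": a distinct
    vector per position, added to the embeddings so the otherwise
    order-blind attention layers can use positional information."""
    pos = np.arange(seq_len)[:, None]              # (seq_len, 1)
    i = np.arange(d_model // 2)[None, :]           # (1, d_model / 2)
    angles = pos / (10000 ** (2 * i / d_model))
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)                   # even dimensions
    pe[:, 1::2] = np.cos(angles)                   # odd dimensions
    return pe

print(sinusoidal_positional_encoding(seq_len=4, d_model=8).shape)  # (4, 8)
```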
ARTICLE 3. Paper summary: “Perceiver: General Perception with Iterative Attention”
  • Why use the Transformer architecture for tasks beyond NLP?
  • What are the obstacles to using Transformers outside the NLP domain?
  • What is quadratic complexity, and how does the Perceiver tackle it?
  • You may know self-attention, but what is cross-attention and why might it be useful? (A short sketch follows this item.)
40 minutes
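A minimal sketch of the Perceiver's core trick, cross-attention from a small latent array to a large input array: because the queries come from the N latents rather than the M inputs, the attention map is N × M instead of M × M, which removes the quadratic bottleneck in the input size. All sizes and weights below are illustrative assumptions, not the paper's actual configuration.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(latents, inputs, Wq, Wk, Wv):
    """Queries come from a small learned latent array (N rows),
    keys/values from the large input array (M rows), so the attention
    map is N x M -- linear, not quadratic, in the input size M."""
    Q = latents @ Wq                                  # (N, d)
    K, V = inputs @ Wk, inputs @ Wv                   # (M, d)
    attn = softmax(Q @ K.T / np.sqrt(K.shape[-1]))    # (N, M)
    return attn @ V                                   # (N, d): input distilled into the latents

rng = np.random.default_rng(0)
M, N, d_in, d = 10_000, 256, 3, 64    # e.g. 10k RGB pixels squeezed into 256 latent vectors
inputs = rng.normal(size=(M, d_in))
latents = rng.normal(size=(N, d))
Wq = rng.normal(size=(d, d))
Wk = rng.normal(size=(d_in, d))
Wv = rng.normal(size=(d_in, d))
print(cross_attention(latents, inputs, Wq, Wk, Wv).shape)  # (256, 64)
```

The latents are then refined by ordinary self-attention blocks, whose cost depends only on N, not on the input size M.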
ARTICLE 4. Perceiver and Perceiver IO work as multi-purpose tools for AI
  • How do you design an architecture for inputs and outputs of arbitrary size and semantics? (See the decoding sketch after this item.)
  • What is the role of Transformers and self-attention in domain-agnostic architectures?
  • How does Perceiver IO elaborate on and improve its predecessor, the Perceiver?
40 minutes
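Perceiver IO's main addition over the Perceiver is a querying decoder: a second cross-attention in which one query row per desired output element attends to the processed latents, so outputs of arbitrary size and semantics can be produced. The sketch below only illustrates that idea; the query construction, sizes, and weights are assumptions for the example, not the published architecture details.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def decode_with_queries(output_queries, latents, Wq, Wk, Wv):
    """Perceiver IO-style decoding: each output query attends to the
    processed latent array, so the number and meaning of the outputs
    is set by the queries, independently of the input size."""
    Q = output_queries @ Wq                               # (O, d)
    K, V = latents @ Wk, latents @ Wv                     # (N, d)
    return softmax(Q @ K.T / np.sqrt(K.shape[-1])) @ V    # (O, d)

rng = np.random.default_rng(0)
N, d = 256, 64                     # latent array produced by the encoder/processor
O = 1_000                          # e.g. one query per output pixel / token / flow vector
latents = rng.normal(size=(N, d))
output_queries = rng.normal(size=(O, d))
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
print(decode_with_queries(output_queries, latents, Wq, Wk, Wv).shape)  # (1000, 64)
```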
REPO 5. DeepMind Perceivers: GitHub Repository
  • What are the key differences between Perceiver and its successor Perceiver IO?
  • How to use pretrained Perceiver IO models in a Colab notebook?
  • How to use the training scripts?
15 minutes
OTHER 6. Perceivers: Performance Discussion on Reddit
  • What do ML practitioners think about the Perceiver's performance on lower volumes of training data?
5 minutes
