Covers: theory of Text-to-Text Transformers
Estimated time needed to finish: 180 minutes
Questions this item addresses:
  • 1- What are the effects of the pre-training objective on the model's performance?
  • 2- What are the effects of dataset size on performance?
  • 3- What are the effects of model architecture on its performance?
  • 4- What is the best fine-tuning strategy?
How to use this item?

Read the entire thing

Author(s) / creator(s) / reference(s)
Raffel et al.
Recipe

T5 paper: Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

Total time needed: ~5 hours
Objectives
Exploring the different topics discussed in this large-scale study
Potential Use Cases
Various NLP tasks such as translation, question answering, and summarisation
Who is this for?
ADVANCED
This recipe consists of the following annotated items.
PAPER 1. Original paper
  • 1- What are the effects of the pre-training objective on the model's performance? (see the span-corruption sketch after this item)
  • 2- What are the effects of dataset size on performance?
  • 3- What are the effects of model architecture on its performance?
  • 4- What is the best fine-tuning strategy?
180 minutes
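
Companion sketch for item 1: among the pre-training objectives compared in the paper, the final T5 models use span corruption in the text-to-text format. Contiguous spans of the input are replaced by sentinel tokens, and the target spells out only the dropped spans, each introduced by its sentinel and closed by a final sentinel. The plain-Python helper below is our own illustration, not the paper's code; the corruption rate and span-length sampling are simplified relative to the paper, and only the input/target layout is meant to be faithful.

    import random

    def span_corrupt(tokens, corruption_rate=0.15, mean_span_len=3, seed=0):
        # Illustrative sampling: mask roughly corruption_rate of the tokens in
        # short contiguous spans; the paper's exact sampling procedure differs.
        rng = random.Random(seed)
        n_to_corrupt = max(1, round(len(tokens) * corruption_rate))
        masked = [False] * len(tokens)
        corrupted = 0
        while corrupted < n_to_corrupt and not all(masked):
            span_len = max(1, round(rng.gauss(mean_span_len, 1)))
            start = rng.randrange(len(tokens))
            for i in range(start, min(start + span_len, len(tokens))):
                if not masked[i]:
                    masked[i] = True
                    corrupted += 1
        # Build the input/target pair: each corrupted span becomes one sentinel
        # in the input, and the target lists each sentinel followed by the
        # tokens it replaced, ending with a final sentinel.
        inputs, targets, sentinel, i = [], [], 0, 0
        while i < len(tokens):
            if masked[i]:
                inputs.append(f"<extra_id_{sentinel}>")
                targets.append(f"<extra_id_{sentinel}>")
                while i < len(tokens) and masked[i]:
                    targets.append(tokens[i])
                    i += 1
                sentinel += 1
            else:
                inputs.append(tokens[i])
                i += 1
        targets.append(f"<extra_id_{sentinel}>")
        return " ".join(inputs), " ".join(targets)

    print(span_corrupt("Thank you for inviting me to your party last week".split()))
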
VIDEO 2. A useful video that explores the T5 large-scale study on Transfer Learning
  • 1- What are the main highlights and takeaways of the T5 paper?
25 minutes
ARTICLE 3. BERT's pre-training objective
  • 1- How is BERT pre-trained? (see the masked-LM sketch below)
10 minutes
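
Companion sketch for item 3: BERT is pre-trained with masked language modeling (plus next-sentence prediction, not shown here). Roughly 15% of token positions are selected; of those, 80% are replaced by [MASK], 10% by a random token, and 10% are left unchanged, and the model is trained to recover the original token at every selected position. The helper below is a toy illustration with whitespace tokens and a made-up vocabulary, not BERT's real tokenizer or data pipeline.

    import random

    def mlm_mask(tokens, vocab, mask_rate=0.15, seed=0):
        # Select ~mask_rate of the positions; of those, 80% become [MASK],
        # 10% a random vocabulary token, 10% are left unchanged. The label at
        # every selected position is the original token the model must predict.
        rng = random.Random(seed)
        inputs, labels = list(tokens), [None] * len(tokens)
        for i, tok in enumerate(tokens):
            if rng.random() < mask_rate:
                labels[i] = tok
                r = rng.random()
                if r < 0.8:
                    inputs[i] = "[MASK]"
                elif r < 0.9:
                    inputs[i] = rng.choice(vocab)
                # else: keep the original token; it is still predicted
        return inputs, labels

    toks = "the cat sat on the mat and looked at the dog".split()
    print(mlm_mask(toks, vocab=["house", "ran", "blue", "tree"]))
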
ARTICLE 4. XLNet's pre-training objective
  • 1- How is XLNet pre-trained? (see the permutation-LM sketch below)
10 minutes
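
Companion sketch for item 4: XLNet is pre-trained with permutation language modeling. A factorization order over the sequence positions is sampled, and each token is predicted from the tokens that precede it in that order rather than strictly left to right; the real model realizes this with attention masks and two-stream self-attention, and in practice predicts only the last tokens of each sampled order. The sketch below simply prints which tokens form the context for each prediction.

    import random

    def permutation_lm_contexts(tokens, seed=0):
        # Sample one factorization order z; the token at position z[t] is
        # predicted from the tokens at positions z[:t].
        rng = random.Random(seed)
        order = list(range(len(tokens)))
        rng.shuffle(order)
        for t, pos in enumerate(order):
            context = [tokens[p] for p in order[:t]]
            print(f"predict {tokens[pos]!r} given {context}")

    permutation_lm_contexts("new york is a city".split())
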
ARTICLE 5. Transformers
  • 1- What are transformers?
  • 2- How do transformers work? (see the attention sketch below)
60 minutes
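
Companion sketch for item 5: the core computation inside every Transformer layer is scaled dot-product attention, softmax(QK^T / sqrt(d_k)) V. The NumPy snippet below shows a single unmasked head without learned projection matrices; a full Transformer adds multi-head projections, masking, residual connections, layer normalization, and position-wise feed-forward blocks.

    import numpy as np

    def scaled_dot_product_attention(Q, K, V):
        # softmax(Q K^T / sqrt(d_k)) V for a single unmasked attention head.
        d_k = Q.shape[-1]
        scores = Q @ K.T / np.sqrt(d_k)                  # (n_queries, n_keys)
        scores -= scores.max(axis=-1, keepdims=True)     # numerical stability
        weights = np.exp(scores)
        weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
        return weights @ V                               # (n_queries, d_v)

    rng = np.random.default_rng(0)
    Q, K, V = (rng.standard_normal((4, 8)) for _ in range(3))
    print(scaled_dot_product_attention(Q, K, V).shape)   # (4, 8)
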

Concepts Covered
Transfer learning, the text-to-text framework, pre-training objectives, Transformer architectures, and fine-tuning strategies.