Covers: theory of Transformer
Estimated time needed to finish: 7 minutes
Questions this item addresses:
  • How can we reduce the computational cost of the attention calculations, which grow quadratically with sequence length?
How to use this item?

This blog post summarizes the main Longformer paper in a simple and easy-to-understand way.

Author(s) / creator(s) / reference(s)
Refer to the link.

Understand the paper: Longformer: The Long-Document Transformer

Contributors
Total time needed: ~2 hours
Objectives
With this list you can learn about Longformer and how to implement it.
Potential Use Cases
Long text summarization, Long text question answering.
Who Is This For?
ADVANCED: NLP data scientists of all experience levels
Click on each of the following annotated items to see details.
ARTICLE 1. Introduction to Transformer Encoder Decoder Model
  • What is a Transformer and how does it work?
10 minutes
ARTICLE 2. Attention Mechanisms in Neural Machine Translation
  • What is attention, and what types of attention mechanisms exist?
20 minutes
PAPER 3. Longformer: The Long-Document Transformer (Original Paper)
  • What is the Longformer, and which attention mechanism does it use to process long sequences?
20 minutes
ARTICLE 4. Understanding Transformer-Based Self-Supervised Architectures - LongFormer
  • How can we reduce the computational cost of the attention calculations, which grow quadratically with sequence length?
7 minutes
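The core answer to the question above is Longformer's windowed (local) attention: each token attends only to a fixed-size neighborhood of w tokens on either side, so the cost scales as O(n × w) rather than O(n²). Below is a minimal pure-Python sketch of that idea; the function name, the toy vectors, and the omission of Longformer's additional global-attention tokens are all simplifications for illustration, not the paper's actual implementation.

```python
import math

def sliding_window_attention(q, k, v, w):
    """Toy windowed self-attention: each query i attends only to keys
    in positions [i - w, i + w], so total work is O(n * w) instead of
    the O(n^2) of full attention. q, k, v are lists of equal-length
    float vectors, one per token."""
    n, d = len(q), len(q[0])
    out = []
    for i in range(n):
        lo, hi = max(0, i - w), min(n, i + w + 1)
        # Scaled dot-product scores against keys inside the window only.
        scores = [sum(qx * kx for qx, kx in zip(q[i], k[j])) / math.sqrt(d)
                  for j in range(lo, hi)]
        # Numerically stable softmax over the window.
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        weights = [e / z for e in exps]
        # Weighted sum of the value vectors inside the window.
        out.append([sum(wt * v[j][c] for wt, j in zip(weights, range(lo, hi)))
                    for c in range(d)])
    return out
```

With all-zero queries and keys the scores are uniform, so each output is just the mean of the values inside its window, which makes the windowing easy to verify by hand. The real Longformer also adds a few globally attending tokens (e.g. [CLS]) on top of this local pattern.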
REPO 5. Longformer: The Long-Document Transformer GitHub Repo
  • How is the Longformer implemented?
10 minutes
ARTICLE 6. Train a Longformer for Detecting Hyper-partisan News
  • How to train a Longformer?
7 minutes
ARTICLE 7. Train a Longformer for Question Answering
  • How to implement a Longformer for a question answering task?
7 minutes

Concepts Covered
