Covers: theory of Sparse Sequence-to-Sequence Models

- What are Fenchel-Young Losses?

Read the beginning of the paper and Section 3 (which starts on page 2).

Authors: Mathieu Blondel, André F.T. Martins, Vlad Niculae


- Objectives
- Understand the Smoothing and Shrinking the Sparse Seq2Seq Search Space paper.
- Potential Use Cases
- Pronunciation (computer reading of text), text-to-speech, and morphological inflection.
- Who is This For?
- INTERMEDIATE. Natural Language Processing (NLP) developers looking to understand how to build better sequence-to-sequence models.

Click on each of the following **annotated items** to see details.

Resource Assets

REPO 1. Background on Neural Machine Translation

- What is neural machine translation?
- How does neural machine translation work?

5 minutes

ARTICLE 2. Regularization Techniques

- What is Regularization?
- How does Regularization help in reducing Overfitting?

5 minutes
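To make the regularization idea concrete before reading: an L2 penalty adds λ‖w‖² to the training loss, shrinking weights toward zero and reducing overfitting. A minimal sketch using ridge regression (which has a closed form); the function name `ridge_fit` and the data are our own illustration, not from the article:

```python
import numpy as np

def ridge_fit(X, y, lam=1.0):
    """Ridge regression: minimize ||Xw - y||^2 + lam * ||w||^2.
    Closed form: w = (X^T X + lam * I)^{-1} X^T y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

# Synthetic data: larger lam shrinks the fitted weights toward zero,
# trading a little bias for less variance (less overfitting to noise).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=50)
w_weak = ridge_fit(X, y, lam=0.01)
w_strong = ridge_fit(X, y, lam=100.0)
```

The same principle (penalizing model complexity) underlies weight decay, dropout, and label smoothing in neural sequence models.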

PAPER 3. Sparse Sequence-to-Sequence Models

- What are sequence-to-sequence models?
- What are dense attention alignments?
- What are neural sparse seq2seq models?

10 minutes
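The sparse attention in this paper replaces softmax with mappings that can assign exactly zero probability. Sparsemax (the α = 2 member of the entmax family) is the simplest such mapping; a minimal NumPy sketch of it, not the authors' implementation:

```python
import numpy as np

def sparsemax(z):
    """Euclidean projection of scores z onto the probability simplex
    (Martins & Astudillo, 2016). Unlike softmax, low-scoring entries
    receive exactly zero probability, giving sparse attention weights."""
    z = np.asarray(z, dtype=float)
    z_sorted = np.sort(z)[::-1]
    k = np.arange(1, z.size + 1)
    cumsum = np.cumsum(z_sorted)
    # Support size: the largest k with 1 + k * z_sorted[k-1] > cumsum[k-1].
    k_star = k[1 + k * z_sorted > cumsum][-1]
    tau = (cumsum[k_star - 1] - 1) / k_star
    return np.maximum(z - tau, 0.0)

p = sparsemax([2.0, 1.0, 0.1])  # the low-scoring entry gets exactly 0
```

Dense (softmax) attention would spread some probability mass over every input position; sparsemax truncates the tail, which is what makes the resulting alignments interpretable.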

WRITEUP 4. Application: Smoothing and Shrinking the Sparse Seq2Seq Search Space

- What are the use cases of smoothing and shrinking the sparse Seq2Seq search space?

10 minutes

PAPER 5. On NMT Search Errors and Model Errors: Cat Got Your Tongue?

- What is the "cat got your tongue" problem?
- Why and where does this problem occur?

5 minutes

PAPER 6. Correcting Length Bias in Neural Machine Translation

- What are the two main problems in neural machine translation?
- What is the beam problem and the brevity problem?

15 minutes
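One widely used mitigation for the brevity problem is to rescale each hypothesis's score by a length penalty during beam search; the GNMT-style penalty of Wu et al. (2016) is a common choice. A minimal sketch, with a helper name of our own:

```python
def length_normalized_score(token_log_probs, alpha=0.6):
    """Divide a hypothesis's total log-probability by the GNMT length
    penalty ((5 + |Y|) / 6) ** alpha (Wu et al., 2016), so beam search
    stops systematically preferring overly short outputs."""
    lp = ((5 + len(token_log_probs)) / 6) ** alpha
    return sum(token_log_probs) / lp

# By raw log-probability the shorter hypothesis wins (-1.8 > -2.0);
# after length normalization the longer, per-token-better one does.
short = [-0.9, -0.9]
long = [-0.5, -0.5, -0.5, -0.5]
```

The paper linked above analyzes why this bias arises in the first place; length normalization treats the symptom rather than the cause.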

WRITEUP 7. Existing solutions for the cat got your tongue problem

- What are existing solutions for the cat got your tongue problem?

2 minutes

PAPER 8. Six Challenges for Neural Machine Translation

- What is the "beam search curse"?

3 minutes

PAPER 9. Learning Classifiers with Fenchel-Young Losses: Generalized Entropies, Margins, and Algorithms

- What are Fenchel-Young Losses?

6 minutes
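Before reading, it may help to see the definition in code: a Fenchel-Young loss is L(z; y) = Ω*(z) + Ω(y) − ⟨z, y⟩ for a convex regularizer Ω. With Ω the negative Shannon entropy it recovers softmax cross-entropy; other choices yield the sparsemax and entmax losses. A minimal sketch of the Shannon case (the function name is ours):

```python
import numpy as np

def fy_loss_shannon(z, y):
    """Fenchel-Young loss L(z; y) = Omega*(z) + Omega(y) - <z, y>
    for Omega(p) = sum(p * log p) (negative Shannon entropy), whose
    convex conjugate Omega*(z) is logsumexp(z)."""
    omega_star = np.log(np.exp(z).sum())      # logsumexp = Omega*(z)
    y_pos = y[y > 0]                          # avoid log(0) terms
    omega_y = float((y_pos * np.log(y_pos)).sum())
    return omega_star + omega_y - float(z @ y)

# For a one-hot target y, Omega(y) = 0 and the loss reduces to
# logsumexp(z) - z_correct, i.e. exactly softmax cross-entropy.
z = np.array([2.0, 1.0, 0.0])
y = np.array([1.0, 0.0, 0.0])
```

Swapping Ω for a Tsallis entropy gives the entmax losses used in the seq2seq papers on this list.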

PAPER 10. Smoothing and Shrinking the Sparse Seq2Seq Search Space

- Why are entmax-based seq2seq models better?

8 minutes
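Entmax interpolates between softmax (α = 1) and sparsemax (α = 2); for intermediate α such as 1.5 it can be computed by bisection on the normalization threshold (Peters et al., 2019). A minimal sketch under that formulation, not the authors' released code:

```python
import numpy as np

def entmax_bisect(z, alpha=1.5, n_iter=60):
    """alpha-entmax via bisection on the threshold tau:
    p_i = max((alpha - 1) * z_i - tau, 0) ** (1 / (alpha - 1)),
    with tau chosen so the probabilities sum to 1.
    alpha -> 1 recovers softmax; alpha = 2 is sparsemax."""
    z = np.asarray(z, dtype=float) * (alpha - 1)
    lo, hi = z.max() - 1.0, z.max()  # tau is bracketed in this interval
    for _ in range(n_iter):
        tau = (lo + hi) / 2
        p = np.maximum(z - tau, 0.0) ** (1.0 / (alpha - 1))
        if p.sum() < 1.0:
            hi = tau  # too much thresholding: lower tau
        else:
            lo = tau  # mass exceeds 1: raise tau
    return p / p.sum()
```

Because entmax can place exactly zero probability on bad continuations, entmax-based seq2seq models can assign probability zero to pathological hypotheses (such as the empty string), which is the paper's link back to the search-error problems above.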
