Step 1: Feature Extraction - Feature extraction is the prerequisite for constructing the datasets used in the paper, which feed the first encoder. The authors needed lists of features, each tagged with a sentiment, for every image, so a good deal of work went into using MIRQI and a label extractor to get those datasets up and running.
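To make the label-extraction step concrete, here is a minimal Python sketch of pulling (finding, sentiment) pairs out of report text. The keyword list, negation cues, and function name are toy assumptions for illustration; MIRQI's actual rule set and ontology are far richer.

```python
import re

# Toy finding keywords and negation cues (assumptions for illustration;
# MIRQI uses a much richer rule set and ontology).
FINDINGS = ["cardiomegaly", "effusion", "edema", "pneumothorax"]
NEGATIONS = re.compile(r"\b(no|without|free of|negative for)\b", re.IGNORECASE)

def extract_labels(report: str) -> list[tuple[str, str]]:
    """Return (finding, sentiment) pairs mined from one report, where
    sentiment is 'positive' (present) or 'negative' (ruled out)."""
    labels = []
    for sentence in report.split("."):
        polarity = "negative" if NEGATIONS.search(sentence) else "positive"
        for finding in FINDINGS:
            if finding in sentence.lower():
                labels.append((finding, polarity))
    return labels

print(extract_labels("Mild cardiomegaly. No pleural effusion or pneumothorax."))
# [('cardiomegaly', 'positive'), ('effusion', 'negative'), ('pneumothorax', 'negative')]
```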

Step 2: Attention Mechanism - Before building the models that consume the datasets from Step 1, you need to understand what makes the Transformer so different: why it lends itself to parallel computation, how it produces contextual representations, and why that matters.
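For a concrete reference point, here is a minimal NumPy sketch of the scaled dot-product attention defined in "Attention Is All You Need" (asset 4 below); the shapes and the self-attention call are illustrative. Because every query-key score comes out of a single matrix product, the whole sequence is processed at once, which is the source of the Transformer's parallelism.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                  # all pairwise scores at once
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ V                               # context-weighted values

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))                          # 4 tokens, d_k = 8
print(scaled_dot_product_attention(x, x, x).shape)   # self-attention -> (4, 8)
```

Each output row is a context-dependent mixture of all the value vectors, which is what makes the resulting token representations contextual.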

Step 3: Visual Language Model - Once the dataset is constructed and you understand the mechanics behind the model, you can dig into the architecture the authors built: a progressive, image-to-text-to-text pipeline.
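As a mental model, here is a hypothetical Python skeleton of that image-to-text-to-text flow. The class, field, and stage names are illustrative placeholders, not the authors' actual code, but they show how the three supporting topics slot together.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class ProgressiveReportGenerator:
    # Hypothetical stages mirroring the supporting assets:
    visual_backbone: Callable        # e.g. a ResNet feature extractor (asset 6)
    vision_language_model: Callable  # image-to-text Transformer (assets 4-5, 7)
    text_to_text_model: Callable     # e.g. a BART-style refiner (asset 8)

    def generate(self, image):
        features = self.visual_backbone(image)        # image -> visual features
        draft = self.vision_language_model(features)  # features -> intermediate text
        return self.text_to_text_model(draft)         # intermediate text -> report

# Toy stand-ins just to show the data flow end to end.
toy = ProgressiveReportGenerator(
    visual_backbone=lambda img: [0.1, 0.9],
    vision_language_model=lambda feats: "cardiomegaly without effusion",
    text_to_text_model=lambda draft: f"Findings: {draft}.",
)
print(toy.generate(image=None))  # Findings: cardiomegaly without effusion.
```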

For each topic, the assets generally start with an accessible explanation on Towards Data Science or Medium, then move on to a peer-reviewed work once you have a general understanding.

Covers: implementation of Natural Language Generation
Estimated time needed to finish: 3 minutes
Questions this item addresses:
  • Why does the recipe for "Understanding the Paper: Progressive Transformer-Based Generation of Radiology Reports" have the three supporting concept assets sorted in this order?

Understanding the Paper: Progressive Transformer-Based Generation of Radiology Reports

Total time needed: ~3 hours
Objectives
Learn what goes into building a multi-stage Transformer model by exploring three fundamental topics: Feature Extraction, Attention Mechanism, and Visual Language Models.
Potential Use Cases
Clinical Data Mining, Automated & Reproducible Medical Diagnosis, Coherent Report Generation from Images
Who is This For?
INTERMEDIATE: NLP or Computer Vision specialists or enthusiasts
Resource Assets (10)
WRITEUP 1. Introduction: Supporting Concepts to Understand Transformer-Based Report Generation
  • Why does the recipe for "Understanding the Paper: Progressive Transformer-Based Generation of Radiology Reports" have the three supporting concept assets sorted in this order?
3 minutes
PAPER 2. An Overview of Image Caption Generation Methods
  • What are the main types of feature extraction methods for images?
  • How may transformers be leveraged to extract features from images?
30 minutes
PAPER 3. When Radiology Report Generation Meets Knowledge Graph
  • How did Nooralahzadeh et al., the authors of our paper of interest "Progressive Transformer-Based Generation of Radiology Reports", construct the training datasets using MIRQI?
15 minutes
PAPER 4. Attention Is All You Need
  • How do Transformers differ from CNNs and RNNs?
30 minutes
ARTICLE 5. Transformers in Computer Vision: Farewell Convolutions!
  • How can Transformers with attention mechanisms overcome the limitations of convolutional models?
14 minutes
ARTICLE 6. An Overview of ResNet and its Variants
  • What is needed to construct a visual backbone?
15 minutes
PAPER 7. Visual Language Model Content: Generating Radiology Reports via Memory-driven Transformer
  • How was the original idea behind visual language modeling described?
10 minutes
ARTICLE 8. Revealing BART: A denoising objective for pretraining
  • How does BART contribute to the fine-tuning of the language model within Nooralahzadeh et al.'s framework?
6 minutes
PAPER 9. Natural Language Generation Content: Generating Radiology Reports via Memory-driven Transformer
  • How has recent radiology report generation been conducted using memory-driven Transformers?
10 minutes
PAPER 10. Progressive Transformer-Based Generation of Radiology Reports
  • How may an image-to-text-to-text pipeline be leveraged to generate radiology reports?
25 minutes
