Findings of the Association for Computational Linguistics: EMNLP 2021
DOI: 10.18653/v1/2021.findings-emnlp.241

Progressive Transformer-Based Generation of Radiology Reports

Abstract: Inspired by Curriculum Learning, we propose a consecutive (i.e., image-to-text-to-text) generation framework where we divide the problem of radiology report generation into two steps. Contrary to generating the full radiology report from the image at once, the model generates global concepts from the image in the first step and then reforms them into finer and coherent texts using a transformer architecture. We follow the transformer-based sequence-to-sequence paradigm at each step. We improve upon the state-of…
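As a rough illustration of the two-step pipeline the abstract describes, the sketch below wires a CNN feature extractor into a BART sequence-to-sequence rewriter. The model choices (ResNet-50, facebook/bart-base), the dummy input, and the hard-coded concept string are illustrative assumptions, not the authors' exact configuration.

```python
# Minimal sketch of the two-step (image -> concepts -> report) idea.
import torch
from torchvision.models import resnet50
from transformers import BartForConditionalGeneration, BartTokenizer

# Step 1: a CNN backbone extracts visual features; in the paper a transformer
# decoder turns these into a short sequence of global concepts.
cnn = resnet50(weights=None)
cnn.fc = torch.nn.Identity()            # keep the 2048-d pooled feature
image = torch.randn(1, 3, 224, 224)     # dummy chest X-ray tensor
visual_features = cnn(image)            # shape: (1, 2048)

# We hard-code a plausible step-1 output for illustration.
concepts = "cardiomegaly pleural effusion no pneumothorax"

# Step 2: a sequence-to-sequence language model (BART in the cited work)
# rewrites the concepts into a fluent report. A fine-tuned checkpoint would
# be needed for real reports; the base model here just shows the interface.
tok = BartTokenizer.from_pretrained("facebook/bart-base")
lm = BartForConditionalGeneration.from_pretrained("facebook/bart-base")
inputs = tok(concepts, return_tensors="pt")
report_ids = lm.generate(**inputs, max_length=64, num_beams=4)
print(tok.decode(report_ids[0], skip_special_tokens=True))
```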

Cited by 31 publications (10 citation statements) · References 18 publications
“…LLMs have demonstrated significant performance gains for medical problem summarization tasks [42]. We note preliminary reports of the use of LLMs to generate radiological reports from images [43]. These examples demonstrate potential use cases for medical charting assistance.…”
Section: Discussion (mentioning)
confidence: 99%
“…They proposed a relational memory (RM) module to retain knowledge from previous cases, thereby enabling the generator model to remember similar reports when generating current reports. Another study (7) proposed a progressive Transformer-based framework for report generation, which generates high-level context from the given X-ray and then employs the Transformer architecture to convert this context into a radiology report. This model comprises a pre-trained CNN as a visual backbone, a mesh-memory Transformer (17) as a visual-language model, and BART (21) as a language model.…”
Section: Relevant Literature (mentioning)
confidence: 99%
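The relational memory idea mentioned above can be pictured as a small slot matrix updated at each decoding step by attending over itself and the previous output embedding, with a gated residual. The toy module below follows that general recipe; the slot count, gating form, and shapes are simplified assumptions, not the exact published module.

```python
import torch
import torch.nn as nn

class RelationalMemory(nn.Module):
    def __init__(self, num_slots: int = 3, d_model: int = 512, n_heads: int = 8):
        super().__init__()
        self.memory = nn.Parameter(torch.randn(num_slots, d_model))  # initial slots
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.gate = nn.Linear(d_model * 2, d_model * 2)

    def forward(self, memory: torch.Tensor, prev_token: torch.Tensor) -> torch.Tensor:
        # memory: (batch, slots, d); prev_token: (batch, 1, d)
        kv = torch.cat([memory, prev_token], dim=1)   # attend over memory + last output
        update, _ = self.attn(memory, kv, kv)
        forget, input_ = self.gate(torch.cat([memory, update], dim=-1)).chunk(2, dim=-1)
        # Gated residual: keep part of the old memory, write part of the update.
        return torch.sigmoid(forget) * memory + torch.sigmoid(input_) * torch.tanh(update)

rm = RelationalMemory()
mem = rm.memory.unsqueeze(0)     # (1, 3, 512) initial memory
tok = torch.randn(1, 1, 512)     # embedding of the previous output token
mem = rm(mem, tok)               # updated memory, carried across decoding steps
print(mem.shape)                 # torch.Size([1, 3, 512])
```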
“…Our BLEU-1 results exhibit strong concordance with existing report generation literature, which has established scoring norms averaging 0.3 to 0.4 for this metric. We conducted a comparative analysis of our model against relevant state-of-the-art models (23, 24, 7, 22), referencing the results documented in their published literature. On ROUGE-L, which reflects a model’s ability to capture document-level linguistic coherence, our approach excelled, achieving 0.331, the highest score across all models.…”
Section: Experiments and Analysis (mentioning)
confidence: 99%
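For context on how the quoted numbers are produced, the snippet below computes BLEU-1 and ROUGE-L for a pair of made-up report sentences using NLTK and the rouge-score package; the texts are illustrative only, not from any evaluated model.

```python
from nltk.translate.bleu_score import sentence_bleu
from rouge_score import rouge_scorer

reference = "the heart size is normal and the lungs are clear"
candidate = "heart size is normal lungs are clear without effusion"

# BLEU-1: unigram precision with brevity penalty (weights select 1-grams only).
bleu1 = sentence_bleu([reference.split()], candidate.split(),
                      weights=(1.0, 0, 0, 0))

# ROUGE-L: F-measure over the longest common subsequence, which is why it is
# read as a proxy for document-level coherence.
scorer = rouge_scorer.RougeScorer(["rougeL"], use_stemmer=True)
rouge_l = scorer.score(reference, candidate)["rougeL"].fmeasure

print(f"BLEU-1={bleu1:.3f}  ROUGE-L={rouge_l:.3f}")
```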
“…First, they identified the disease from the image using a CNN and then used prior knowledge for report generation. Nooralahzadeh et al. [20] proposed a two-step model which derived global concepts from the image and then reformed them into finer and coherent texts using a transformer architecture. You et al. [22] proposed an AlignTransformer framework, whose components were Align Hierarchical Attention (AHA) and a Multi-Grained Transformer (MGT).…”
Section: Medical Report Generation (mentioning)
confidence: 99%