2023
DOI: 10.1016/j.displa.2023.102377

Relational-Convergent Transformer for image captioning

Cited by 8 publications (1 citation statement)
References 23 publications
“…Early works [22] developed models based on long short-term memory (LSTM [23]) networks [14,24,25] and convolutional neural network (CNN) models. Recently, attention modules [13,27,28] based on the transformer [26] model have been widely used because of their strong ability to handle visual and language features. Although their performance is attractive and image captioning is somewhat similar to radiology report generation, these methods from general image captioning have limited applicability to the radiology report generation task [29,30].…”
Section: Image Captioning (mentioning)
Confidence: 99%