2019
DOI: 10.3233/jifs-179027
Generating image captions through multimodal embedding

Cited by 4 publications (1 citation statement)
References 18 publications
“…The multi-modal translation is an emerging task of the MT community, where visual features of image combine with textual features of parallel source-target text to translate sentences (Shah et al, 2016). Interestingly, multi-modal concept improved the translation quality of generating the captions of the images (Dash et al, 2019) as well as significant improvement over text-only NMT system (Huang et al, 2016). In text-only NMT system, the encoder-decoder framework of NMT is a widely accepted technique used in the task of MT.…”
Section: Introduction (mentioning)
Confidence: 99%