2022
DOI: 10.14569/ijacsa.2022.0130886

CapNet: An Encoder-Decoder based Neural Network Model for Automatic Bangla Image Caption Generation

Abstract: Automatic caption generation from images has become an active research topic in the fields of Computer Vision (CV) and Natural Language Processing (NLP). Machine-generated image captions play a vital role for visually impaired people: converting the caption to speech gives them a better understanding of their surroundings. Though a significant amount of research has been conducted on automatic caption generation in other languages, far too little effort has been devoted to Bangla image caption generation. In t…

Cited by 1 publication (1 citation statement)
References 14 publications

“…As opposed to traditional encoder-decoder architectures, Bahdanau Attention Based Bengali Image Caption Generation (BABBICG) [87] uses an InceptionV3 neural network to extract features from images and produce captions using an RNN decoder. [88] describe a model based on encoders and decoders that takes a picture as input and outputs the appropriate Bangla caption. The decoder network is made up of Bidirectional LSTMs for caption synthesis, whereas the encoder network is made up of ResNet-50, a pretrained image feature extractor.…”
Section: Image Captioning For Bengali Language
Citation type: mentioning
confidence: 99%
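
The citation statement above describes an encoder-decoder pipeline: an image feature extractor (ResNet-50) feeding a caption decoder (Bidirectional LSTMs). As a rough illustration of that data flow only, the following sketch wires a word-by-word generation loop around two toy stand-ins. Everything here is an assumption made for illustration: `toy_encoder` and `toy_next_word` are hypothetical placeholders rather than the paper's networks, the `<start>`/`<end>` tokens and the greedy one-word-at-a-time loop are not stated in the source, and the vocabulary is toy English rather than Bangla.

```python
from typing import Callable, List, Sequence

# Assumed sentinel tokens; the source does not say how captions are delimited.
START, END = "<start>", "<end>"

Image = Sequence[Sequence[float]]


def toy_encoder(image: Image) -> List[float]:
    """Hypothetical stand-in for a pretrained extractor such as ResNet-50.

    Returns one feature per image row (the row mean).
    """
    return [sum(row) / len(row) for row in image]


def toy_next_word(features: List[float], words: List[str]) -> str:
    """Hypothetical stand-in for the decoder network.

    Picks the next word from a fixed script, conditioned on the image
    features (brightness) and on how many words were produced so far.
    """
    brightness = sum(features) / len(features)
    script = ["a", "bright" if brightness > 0.5 else "dark", "scene", END]
    position = len(words) - 1  # words[0] is always START
    return script[min(position, len(script) - 1)]


def generate_caption(
    image: Image,
    encode: Callable[[Image], List[float]],
    next_word: Callable[[List[float], List[str]], str],
    max_len: int = 20,
) -> str:
    """Encode the image once, then emit words until END or max_len."""
    features = encode(image)
    words = [START]
    for _ in range(max_len):
        word = next_word(features, words)
        if word == END:
            break
        words.append(word)
    return " ".join(words[1:])  # drop the START token


if __name__ == "__main__":
    print(generate_caption([[0.9, 0.8], [0.7, 0.6]], toy_encoder, toy_next_word))
```

In a real system the two callables would be replaced by trained networks; the loop structure is the only part this sketch is meant to convey.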