2024
DOI: 10.56578/ida030102

Enhancing Image Captioning and Auto-Tagging Through a FCLN with Faster R-CNN Integration

Shalaka Prasad Deore,
Taibah Sohail Bagwan,
Prachiti Sunil Bhukan
et al.

Abstract: In the realm of automated image captioning, which entails generating descriptive text for images, the fusion of Natural Language Processing (NLP) and computer vision techniques is paramount. This study introduces the Fully Convolutional Localization Network (FCLN), a novel approach that concurrently addresses localization and description tasks within a singular forward pass. It maintains spatial information and avoids detail loss, streamlining the training process with consistent optimization. The foundation o…

Citations: cited by 1 publication (1 citation statement)
References: 16 publications
“…It has found extensive applications in natural language processing. With the continuous integration and development of the fields of natural language processing and computer vision [27], Vision Transformer (ViT) [28] pioneered the introduction of Transformer into computer vision, demonstrating favorable results. In 2021, TrSiam [13] combined Transformers with correlation filters, becoming the first Siamese object tracking method with a Transformer structure that enhanced the tracker's feature representation capability.…”
Section: Transformer Tracking
Citation type: mentioning
Confidence: 99%