2023
DOI: 10.30630/joiv.7.2.1387

Pre-Trained CNN Architecture Analysis for Transformer-Based Indonesian Image Caption Generation Model

Abstract: Classification and object recognition in image processing have significantly improved computer vision tasks. These methods are often applied to visual problems, especially image classification using a Convolutional Neural Network (CNN). In image caption generation, a popular state-of-the-art (SOTA) task, the CNN is often used as an encoder that extracts features from the image. Instead of performing direct classification, these extracted features are sent from the encoder to the decoder…

citations
Cited by 1 publication
references
References 29 publications