2023
DOI: 10.1016/j.eswa.2023.119773

Evolution of visual data captioning Methods, Datasets, and evaluation Metrics: A comprehensive survey

Cited by 6 publications
(4 citation statements)
References 166 publications
“…Equation (11) shows the expression used to evaluate the logarithmic decrement of the discrete signal, while Equation (12) describes the relation that expresses the damping ratio ξ₁ in the case of an underdamped vibrating system:…”
Section: Numerical Activity (mentioning)
confidence: 99%
“…Today, however, most studies present in the literature consist of sloshing applications for vibration mitigation activities in structures [4][5][6]. In this work, the authors defined a simplified multibody model developed through a CFD analysis to determine the optimal operating conditions for serial manipulators used in visual control stations for glass containers [7][8][9][10][11]. The SimScape multibody multidomain simulation environment is increasingly used in complex systems analysis due to its ability to model the different subdomains that characterize real systems in a single environment [12][13][14].…”
Section: Introduction (mentioning)
confidence: 99%
“…In [9] paper examines workflows, feature representation, visual encoding, language generation models, data-sets, and assessments of deep photo captioning approaches in natural and medical sciences. The author [10] assisting the visually impaired, interacting with robots, and video surveillance systems are just a few examples of the many applications that Automatic Visual Captioning (AVC), a deep-learning technology, is utilized in. [11] This review paper weighs the benefits and drawbacks of Image-Caption Generator, an AI technology that helps people with visual impairments comprehend images and define language.…”
Section: Overview Of An Earlier Review On Image Captioning (mentioning)
confidence: 99%
“…Fig. 1 demonstrates a typical deep-learning-based approach for image-to-text description [5]. Concerning the encoder-decoder image captioning approaches, Convolutional Neural Networks (CNNs) have been exploited as encoders for visual feature extraction from the images, and Recurrent Neural Networks (RNNs), "especially LSTM (Long Short-Term Memory) networks" have been exploited as decoders for transforming the obtained features into various natural languages [6,7]. However, encoder-decoder-based approaches are not capable of analyzing the images over time and considering the spatial prospects of images that are pertinent to the image description (alternatively, creating descriptions for the entire scene).…”
Section: Introduction (mentioning)
confidence: 99%