Features to Text: A Comprehensive Survey of Deep Learning on Semantic Segmentation and Image Captioning

Oluwasammi, Ariyo; Aftab, Muhammad Umar; Qin, Zhiguang; Son, Ngo Tung; Doan, Thang Van; Nguyen, Son Ba; Nguyen, Giang

doi:10.1155/2021/5538927

Cited by 19 publications

(12 citation statements)

References 167 publications

(135 reference statements)

“…Meanwhile, the decoder part uses Bird Swarm Algorithm (BSA) with LSTM technique so as to concentrate on the generation of descriptive sentences. In an earlier study [19], image captioning and semantic segmentation were widely inspected on the basis of advanced and traditional methods. In this study, the researchers detailed about the application of DL in segmentation examination of both 3D and 2D images utilizing FCN and other high-level hierarchical feature extraction methodologies.…”

Section: Literature Review (mentioning)

confidence: 99%

Natural Language Processing with Optimal Deep Learning-Enabled Intelligent Image Captioning System

Marzouk, Alabdulkreem, Nour et al. 2023

Computers, Materials & Continua

Recent developments in Multimedia Internet of Things (MIoT) devices, empowered with Natural Language Processing (NLP) models, point to a promising future for smart devices. NLP plays an important role in industrial applications such as speech understanding, emotion detection, and home automation. To caption an image, the objects in that image, their actions and connections, and any salient feature that remains under-projected or missing from the image should be identified. The aim of the image captioning process is to generate a caption for an image, that is, one of the most significant and detailed descriptions that is syntactically as well as semantically correct. In this scenario, a computer vision model is used to identify the objects and NLP approaches are followed to describe the image. The current study develops a Natural Language Processing with Optimal Deep Learning Enabled Intelligent Image Captioning System (NLPODL-IICS). The aim of the presented NLPODL-IICS model is to produce a proper description for an input image. To attain this, the proposed NLPODL-IICS follows two stages: encoding and decoding. At the encoding side, the proposed NLPODL-IICS model makes use of Hunger Games Search (HGS) with the Neural Architecture Search Network (NASNet) model. This model represents the input data by embedding it into a vector of predefined length. During the decoding phase, the Chimp Optimization Algorithm (COA) with a deeper Long Short-Term Memory (LSTM) approach is followed to concatenate the description sentences produced by the method. The HGS and COA algorithms perform parameter tuning for the NASNet and LSTM models, respectively. The proposed NLPODL-IICS model was experimentally validated with the help of two benchmark datasets. An extensive comparative analysis confirmed the superior performance of the NLPODL-IICS model over other models.

“…The emotional attributes of short text can be minimized, and then the functions extracted from these two channels are folded and input into the classifier that determines the emotion of the text [ 26 , 27 ]. Each channel of the CNN directly affects the original data, and then the subsequent layers of the multilayer CNN will affect the processed data, so the CNN can extract more direct functions from these two channels [ 28 ]. Figure 7 shows the operation process of the model.…”

Section: Model Establishment and Scheme Design (mentioning)

confidence: 99%

Application of Dual‐Channel Convolutional Neural Network Algorithm in Semantic Feature Analysis of English Text Big Data

2021

Computational Intelligence and Neuroscience

The current Internet data explosion is driving an ever-higher demand for text emotion analysis, which greatly facilitates public opinion analysis and trend prediction, among other tasks. Therefore, this paper proposes to use a dual-channel convolutional neural network (DCNN) algorithm to analyze the semantic features of English text big data. First, following an analysis of the effect of CNN, artificial neural network (ANN), and recurrent neural network (RNN) on English text data analysis, the more effective long short-term memory (LSTM) and gated recurrent unit (GRU) neural networks (NN) are introduced, and each network is combined with the dual-channel CNN and comprehensively analyzed in comparative experiments. Second, the semantic features of English text big data are analyzed through the improved SO-pointwise mutual information (SO-PMI) algorithm. Finally, the ensemble dual-channel CNN model is established. In the comparative experiments, GRU NN has a better feature detection effect than LSTM NN, but the performance increase from dual-channel CNN to GRU NN + dual-channel CNN is not obvious. Comparing the GRU NN + dual-channel CNN model with the LSTM NN + dual-channel CNN model, the former ensures high accuracy of semantic feature analysis and improves the analysis speed of the model. Further, after the attention mechanism is added to the GRU NN + dual-channel CNN model, the accuracy of semantic feature analysis is improved by nearly 1.3%. Therefore, the ensemble model of GRU NN + dual-channel CNN + attention mechanism is more suitable for semantic feature analysis of English text big data. The results will help e-commerce platforms analyze the evaluation language and semantic features of current online English short texts.
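The abstract above does not say how its "improved" SO-PMI differs from the standard one, so the following is only a minimal sketch of the standard semantic-orientation PMI score: the sum of a word's PMI with positive seed words minus the sum of its PMI with negative seed words. The function name `so_pmi`, the document-level co-occurrence counting, the `eps` smoothing constant, and the toy review data are all assumptions made for illustration and are not taken from the paper.

```python
import math
from collections import Counter
from itertools import combinations


def so_pmi(docs, pos_seeds, neg_seeds, eps=1e-9):
    """Standard SO-PMI from document-level co-occurrence (illustrative only).

    docs: list of token lists. pos_seeds / neg_seeds: sets of seed words.
    eps is an assumed smoothing constant that avoids log(0).
    """
    n = len(docs)
    word_df = Counter()   # number of documents containing each word
    pair_df = Counter()   # number of documents containing each word pair
    for doc in docs:
        vocab = set(doc)
        word_df.update(vocab)
        pair_df.update(frozenset(p) for p in combinations(sorted(vocab), 2))

    def pmi(a, b):
        joint = pair_df[frozenset((a, b))] / n
        return math.log2((joint + eps) / (word_df[a] / n * word_df[b] / n + eps))

    scores = {}
    for w in word_df:
        if w in pos_seeds or w in neg_seeds:
            continue
        pos = sum(pmi(w, s) for s in pos_seeds if s in word_df)
        neg = sum(pmi(w, s) for s in neg_seeds if s in word_df)
        scores[w] = pos - neg  # > 0 leans positive, < 0 leans negative
    return scores


# Hypothetical toy corpus of tokenized short reviews.
docs = [
    ["fast", "delivery", "good"],
    ["good", "quality", "fast"],
    ["slow", "delivery", "bad"],
    ["bad", "quality", "slow"],
]
scores = so_pmi(docs, pos_seeds={"good"}, neg_seeds={"bad"})
```

With this toy data, "fast" only ever co-occurs with the positive seed and "slow" only with the negative seed, so they receive scores of opposite sign, while "delivery" co-occurs equally with both and scores zero. Because of the `eps` smoothing, a pair that never co-occurs gets a strongly negative PMI rather than an undefined one.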

“…There are six traditional image segmentation methods based on the threshold, edge detection, graph theory, region, clustering, and specific theoretical tools. For example, reference [ 12 ] analyzed the principles, advantages, and disadvantages of image semantic segmentation based on traditional methods and deep learning methods and pointed out that deep learning network had better optimization results than traditional methods. Reference [ 13 ] proposed a new image redirection method using semantic segmentation and pixel fusion, which could finely reassign the scaling factor for each region according to the semantic segmentation results, so as to effectively reduce the geometric distortion in the process of image redirection, but the detection efficiency needs to be improved.…”

Section: Introduction (mentioning)

confidence: 99%

Image Semantic Segmentation Method Based on Deep Fusion Network and Conditional Random Field

Wang, Yang 2022

Computational Intelligence and Neuroscience

To address the problems of missing points and wrong points in image semantic segmentation under complex backgrounds and small targets, an image semantic segmentation method based on the fully convolutional neural network and conditional random field is proposed. First, a deconvolution fusion structure is added to the fully convolutional neural network to build a deep fusion network. Multiscale features are automatically obtained through the deep fusion network, and the shallow detail information and deep semantic information are fused to improve the accuracy of rough image segmentation. Then, the bivariate potential function of the conditional random field is optimized based on the convolutional neural network, and it is used for fine image segmentation to obtain the final segmentation result. Finally, the proposed method is experimentally analyzed on the Cityscapes dataset. The results show that the proposed method can achieve accurate image segmentation, and the area under the segmentation curve of the overall size target is 93.6%, which is better than other methods.
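The fusion of shallow detail and deep semantic features described above can be illustrated with a small NumPy sketch. This is not the paper's deconvolution fusion structure: nearest-neighbour upsampling stands in for the learned deconvolution, element-wise addition stands in for whatever fusion operator the authors use, and the function name `fuse_features` and the array shapes are assumptions chosen for illustration.

```python
import numpy as np


def fuse_features(shallow, deep):
    """Fuse a high-resolution shallow map with a low-resolution deep map.

    shallow: array of shape (C, H, W).
    deep:    array of shape (C, H // s, W // s) for an integer factor s.
    Nearest-neighbour upsampling is an assumed stand-in for a learned
    deconvolution (transposed convolution) layer.
    """
    c, h, w = shallow.shape
    s = h // deep.shape[1]
    if s < 1 or h % s or w % s or deep.shape != (c, h // s, w // s):
        raise ValueError("deep map must be an integer downscale of the shallow map")
    # Repeat each deep-map pixel s times along height and width.
    upsampled = deep.repeat(s, axis=1).repeat(s, axis=2)
    # Element-wise sum keeps shallow detail and adds deep semantics.
    return shallow + upsampled


# Hypothetical feature maps: 2 channels, 4x4 shallow and 2x2 deep.
shallow = np.zeros((2, 4, 4))
deep = np.arange(8, dtype=float).reshape(2, 2, 2)
fused = fuse_features(shallow, deep)
```

The fused map has the resolution of the shallow input, so a later per-pixel classifier (and, in the method above, the conditional random field refinement) can operate at full spatial detail.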
