Several deep learning techniques have been intensively studied for captioning tasks, enabling textual understanding and description of both simple and complex images. Advancing this line of work, this paper proposes a multimodal end-to-end siamese difference captioning model (SDCM) to automatically generate a natural-language description of the differences in an image pair. The proposed supervised learning model combines several deep learning techniques to capture, align, and compute the disparities between two image features, and to create the corresponding language-model probability distribution. First, a deep siamese convolutional neural network extracts the feature-vector discrepancies of an image pair; an attention mechanism then detects salient regions of the feature vector, which allows a bidirectional long short-term memory decoder to generate a matching and semantically associated textual sequence. The model is evaluated on the Spot-the-Diff baseline dataset, which consists of pairs of images and their corresponding captions. The results indicate that our proposed model is highly competitive with the state of the art. INDEX TERMS Deep convolutional neural network, Siamese network, recurrent neural network, image captioning, deep learning.
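The pipeline this abstract describes, shared-weight (siamese) feature extraction, feature differencing, and soft attention over the difference, can be illustrated with a minimal NumPy sketch. The linear `extract_features` stand-in, the shapes, and all names here are hypothetical simplifications, not the authors' actual architecture:

```python
import numpy as np

def extract_features(image, weights):
    # Stand-in for one shared siamese CNN tower: a single linear
    # projection of flattened pixel positions (hypothetical simplification).
    regions = image.reshape(-1, image.shape[-1])    # (regions, channels)
    return regions @ weights                        # (regions, d)

def attended_difference(img_a, img_b, weights, query):
    # Both images pass through the SAME weights (siamese sharing);
    # the feature difference is then attended over regions.
    fa = extract_features(img_a, weights)
    fb = extract_features(img_b, weights)
    diff = fa - fb                                  # feature discrepancies
    scores = diff @ query                           # relevance per region
    alpha = np.exp(scores - scores.max())
    alpha /= alpha.sum()                            # softmax attention weights
    context = alpha @ diff                          # attended difference vector
    return context, alpha

rng = np.random.default_rng(0)
img_a = rng.normal(size=(4, 4, 3))
img_b = img_a.copy()
img_b[2, 2] += 5.0                                  # one localized change
W = rng.normal(size=(3, 8))
q = rng.normal(size=8)
ctx, alpha = attended_difference(img_a, img_b, W, q)
```

In a full model, the `context` vector would condition the BiLSTM decoder at each time step; here it simply summarizes where and how the two inputs diverge.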
Generating a textual description of the differences between images is a relatively new task that requires the fusion of both computer vision and natural language techniques. In this paper, we present a novel Fully Convolutional CaptionNet (FCC) that employs an encoder-decoder framework to extract visual features, compute the feature distances, and generate new sentences describing the measured distances. After the image features are extracted, a contrastive function computes their weighted L1 distance, which is learned and selectively attended to determine salient sections of the features at every time step. The attended feature region is matched to corresponding words iteratively until a sentence is completed. We propose applying an upsampling network to enlarge the features' field of view, which provides a robust pixel-based discrepancy computation. Our extensive experiments indicate that the FCC model outperforms other learning models on the benchmark Spot-the-Diff dataset by generating succinct and meaningful textual descriptions of image differences. INDEX TERMS Image captioning, deep learning, Siamese network, recurrent neural network, convolutional neural network, attention, fully convolutional networks.
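The two operations named in the abstract, the weighted L1 distance between feature maps and the upsampling that enlarges the field of view, can be sketched as below. This is a hedged simplification: the weighting here is a plain scalar/array multiplier and the upsampling is nearest-neighbour, whereas the paper's versions are learned.

```python
import numpy as np

def weighted_l1(fa, fb, w):
    # Contrastive weighted L1 distance between two feature maps;
    # w (broadcastable to the map shape) stands in for the learned weighting.
    return w * np.abs(fa - fb)

def upsample2x(feat):
    # Nearest-neighbour 2x upsampling, a simple stand-in for the learned
    # upsampling network that enlarges the features' field of view.
    return feat.repeat(2, axis=0).repeat(2, axis=1)

fa = np.array([[1.0, 2.0], [3.0, 4.0]])
fb = np.array([[1.0, 0.0], [0.0, 4.0]])
d = weighted_l1(fa, fb, 0.5)   # per-location weighted discrepancy
up = upsample2x(d)             # enlarged discrepancy map
```

The enlarged map gives the decoder a finer spatial grid over which to attend when matching discrepancy regions to words.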
Summary Owing to the advancement and wide adoption of solar-based technologies, the prediction of solar irradiance has attracted research attention in recent years. In this study, the predictive performance of machine learning models is compared with that of deep learning models for both global solar radiation (GSR) and diffuse solar radiation (DSR) prediction. Different studies have proposed different models for solar radiation prediction: some used machine learning models, while others considered deep learning algorithms. Although these algorithms have been judged appropriate for solar radiation prediction, the variation in their performance raises the question of which algorithm is most suitable. The three most common deep learning models in the literature, namely the artificial neural network, the convolutional neural network, and the recurrent neural network (RNN), are considered within the scope of this study. Two traditional machine learning models, polynomial regression and support vector regression (SVR), are also considered, as well as an ensemble machine learning model, random forest. These models were applied to four different locations in Nigeria, and typical meteorological year data covering 12 years at an hourly time step were used to train and test the models developed. Results from this study show that deep learning models achieve better GSR and DSR prediction accuracy than machine learning models. However, the training and testing time of the machine learning models (except SVR) is shorter than that of the deep learning models, making them more desirable for low-computation applications. The application of the RNN for GSR prediction in Yobe (with an r value of 0.9546 and a root-mean-square error/mean absolute error of 82.22 W/m2/36.52 W/m2) gave the best overall performance of all the models developed in this study.
This study contributes to the existing literature by highlighting the disparities between machine learning and deep learning algorithms when applied to solar radiation forecasting.
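The metrics the study reports for each model, the correlation coefficient r, root-mean-square error (RMSE), and mean absolute error (MAE), can be computed as follows. This is a generic sketch of the standard definitions, not the authors' code; the sample values are illustrative only.

```python
import numpy as np

def evaluate(y_true, y_pred):
    # Standard regression metrics used to rank the models:
    # Pearson r, RMSE and MAE (the latter two in W/m^2 for irradiance).
    r = np.corrcoef(y_true, y_pred)[0, 1]
    rmse = float(np.sqrt(np.mean((y_true - y_pred) ** 2)))
    mae = float(np.mean(np.abs(y_true - y_pred)))
    return r, rmse, mae

y = np.array([100.0, 200.0, 300.0])   # measured GSR (illustrative)
p = np.array([110.0, 190.0, 310.0])   # model prediction (illustrative)
r, rmse, mae = evaluate(y, p)
```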
A major development in the field of access control is the dominant role-based access control (RBAC) scheme. The appeal of RBAC lies in its enhanced security together with the concept of roles. In addition, attribute-based access control (ABAC), known for its dynamic behavior, has joined the family of access control models. Separation of duty (SOD) is used to enforce the least-privilege concept in RBAC and ABAC; moreover, it is a powerful tool for protecting an organization from internal security attacks and threats. Various problems have been found in implementing SOD at the role level. This paper argues that implementing SOD at the level of roles is not a good option, and therefore proposes a hybrid access control model that implements SOD on the basis of permissions. The first part of the proposed model adds attributes with dynamic characteristics to the RBAC model, whereas the second part implements permission-based SOD in the dynamic RBAC model. In comparison with previous models, performance and feature analyses are performed to show the strength of the dynamic RBAC model. The model improves the performance of RBAC in terms of time, dynamicity, and automatic assignment of permissions and roles. At the same time, it reduces the administrator's load and provides a flexible, dynamic, and secure access control model.
The scattering of atmospheric particles significantly alters images captured under hazy weather conditions. Such images appear distorted, blurry, and low in contrast owing to attenuation, which extensively affects computer vision systems. Several prior-based methods have been developed to address this problem, but they come at a high computational cost. We present a fast, single-image dehazing method based on the dark channel prior and Rayleigh scattering. First, we present a simple but effective methodology for estimating the atmospheric light by computing the average, minimum, and maximum of the pixels in each of the three RGB colour channels. Then, using the theory of Rayleigh scattering, we model a scattering coefficient to estimate the initial transmission map. A fast guided filter is adopted to refine the initial transmission map, which suffers from inaccurate halo edges. Finally, we restore the haze-free image through the atmospheric scattering model. Extensive qualitative and computational experiments on hazy outdoor images demonstrate that the proposed method produces excellent results while achieving a faster processing time. INDEX TERMS Image dehazing, Rayleigh scattering, transmission map, image enhancement.
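The final restoration step uses the standard atmospheric scattering model, I = J·t + A·(1 − t), inverted for the scene radiance J. The sketch below implements that well-known inversion; the `estimate_atmospheric_light` function is only one possible reading of the abstract's average/minimum/maximum scheme, not the authors' exact formula, and the transmission estimate omits the patch-based minimum filtering and Rayleigh-derived coefficient for brevity.

```python
import numpy as np

def estimate_atmospheric_light(img):
    # One reading of the abstract's scheme: per RGB channel, average the
    # minimum, maximum and mean pixel values (interpretation, hedged).
    flat = img.reshape(-1, 3)
    return (flat.min(axis=0) + flat.max(axis=0) + flat.mean(axis=0)) / 3.0

def estimate_transmission(img, A, omega=0.95):
    # Per-pixel transmission from the dark channel of the normalized image
    # (patch minimum filtering omitted in this sketch).
    return 1.0 - omega * (img / A).min(axis=2)

def dehaze(img, t, A, t0=0.1):
    # Inversion of the atmospheric scattering model:
    # J = (I - A) / max(t, t0) + A, with t0 preventing division blow-up.
    t = np.maximum(t, t0)[..., None]
    return (img - A) / t + A
```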
Secure localization of vehicles is gaining the attention of researchers from both academia and industry, especially with the emergence of the Internet of Things (IoT). Modern vehicles are usually equipped with circuitry that provides connectivity with other vehicles and with cellular networks such as 4G and the fifth-generation cellular network (5G). The challenge of secure localization and positioning is magnified further by technologies such as autonomous or driverless vehicles based on IoT, satellite, and 5G. Some satellite- and IoT-based localization techniques exploit machine learning, semantic segmentation, and access control mechanisms. Access control grants access and provides a secure information-sharing mechanism to authorized users while restricting unauthorized users, which is necessary for the security and privacy of government or military vehicles. Previously, static conflict-of-interest (COI) based access control was used for security purposes. However, static COI-based access control creates excessive administrative overhead and latency in execution, the least tolerable factor in modern IoT- or 5G-controlled vehicles. Therefore, in this paper, a hybrid access control (HAC) model is proposed that implements dynamic COI at the level of roles. The proposed model enhances the role-based access control (RBAC) model by inserting new attributes into the RBAC entities. The HAC model deals with COI at the role level more efficiently than previously proposed models. Moreover, the model features significant improvement in dynamic behavior, decreased administrative load, and security, especially for vehicular localization. Furthermore, the mathematical modeling of the proposed model is demonstrated with an example scenario to validate the concept. INDEX TERMS Access control, hybrid access control, secure vehicle localization, machine learning, neural networks, Internet of Things.
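The difference between static and dynamic COI is that a dynamic check is evaluated at activation time against the roles a user currently holds, rather than being fixed at assignment time. A minimal sketch of such a check (role names and the set-of-sets COI encoding are hypothetical, not the paper's formalism):

```python
def can_activate(active_roles, new_role, coi_classes):
    # Dynamic COI at the role level: activation is denied only if the user
    # ALREADY holds another role from the same conflict-of-interest class,
    # checked at request time rather than via static assignment rules.
    for coi in coi_classes:
        if new_role in coi and any(r in coi and r != new_role
                                   for r in active_roles):
            return False
    return True

# Hypothetical COI class: auditing roles for two competing fleets.
coi = [{"fleet_a_auditor", "fleet_b_auditor"}]
```

Because the conflict is resolved per request, an administrator need not pre-enumerate every forbidden role combination, which is where the claimed reduction in administrative load comes from.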
The ongoing coronavirus disease 2019 (COVID-19) pandemic caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has had severe ramifications for the global healthcare system, principally because of the virus's easy transmission and its extended survival on contaminated surfaces. With the advances in computer-aided diagnosis and artificial intelligence, this paper presents the application of deep learning and adversarial networks for the automatic identification of COVID-19 pneumonia in computed tomography (CT) scans of the lungs. The complexity and time limitations of the reverse transcription-polymerase chain reaction (RT-PCR) swab test make it disadvantageous to depend on it solely as COVID-19's central diagnostic mechanism. Since CT imaging systems are low cost and widely available, we demonstrate that the drawbacks of RT-PCR can be alleviated with a faster, automated, reduced-contact diagnostic process: a neural network model for classifying infected and noninfected CT scans. In our proposed model, we explore the benefit of transfer learning as a means of resolving the problem of an inadequate dataset, and the importance of a semisupervised generative adversarial network for extracting well-mapped features and generating image data. Our experimental evaluation indicates that the proposed semisupervised model achieves reliable classification by taking advantage of the reflective loss distance between the real data sample space and the generated data.
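A loss distance between real and generated sample spaces, as mentioned at the end of the abstract, is commonly realized in semisupervised GANs as a feature-matching objective: the generator minimizes the distance between the mean intermediate discriminator features of real and generated batches. The sketch below shows that objective only; whether it is the authors' exact "reflective loss distance" is an assumption.

```python
import numpy as np

def feature_matching_loss(real_feats, fake_feats):
    # Feature-matching generator objective for a semisupervised GAN:
    # squared distance between the batch-mean discriminator features of
    # real samples and of generated samples.
    diff = real_feats.mean(axis=0) - fake_feats.mean(axis=0)
    return float(np.sum(diff ** 2))
```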
The coronavirus disease of 2019 (COVID-19) pandemic has caused a global public health crisis, since no vaccine fully cures or prevents the further spread of the virus. With the ever-increasing number of new infections, automated methods for COVID-19 identification from chest X-ray images are critical for aiding clinical diagnosis and reducing the time spent on image interpretation. This paper proposes a novel joint framework for accurate COVID-19 identification that integrates an enhanced super-resolution generative adversarial network with a noise-reducing wavelet-transform filter bank in a convolutional neural network, applied to both chest X-ray and chest tomography images. The super-resolution stage enhances image quality, while the wavelet-transform convolutional neural network architecture accurately identifies COVID-19. Our proposed architecture is very robust to noise and to the vanishing-gradient problem. We used public-domain datasets of chest X-ray and chest tomography images to train and evaluate our COVID-19 identification task. The experiments show that our system is consistently effective, with an accuracy of 0.988, a sensitivity of 0.994, a specificity of 0.987, an AUC of 0.99, an F1-score of 0.982, and a precision of 0.989 on the chest X-ray dataset, and an accuracy of 0.978, a sensitivity of 0.981, a specificity of 0.979, an AUC of 0.985, an F1-score of 0.961, and a precision of 0.980 on the chest tomography dataset. These performances also surpass other established state-of-the-art learning methods.
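The filter bank a wavelet-transform CNN layer applies can be illustrated with a one-level 2-D Haar decomposition, which splits an image into an approximation band (LL) and three detail bands (LH, HL, HH); noise concentrates in the detail bands, which is what makes the representation useful for noise-robust feature extraction. This is a textbook Haar sketch, not the authors' specific wavelet choice:

```python
import numpy as np

def haar_dwt2(x):
    # One-level 2-D Haar decomposition of an even-sized array into the
    # approximation sub-band (LL) and detail sub-bands (LH, HL, HH).
    a = x[0::2, 0::2]; b = x[0::2, 1::2]
    c = x[1::2, 0::2]; d = x[1::2, 1::2]
    ll = (a + b + c + d) / 2.0   # low-pass in both directions
    lh = (a + b - c - d) / 2.0   # vertical detail
    hl = (a - b + c - d) / 2.0   # horizontal detail
    hh = (a - b - c + d) / 2.0   # diagonal detail
    return ll, lh, hl, hh
```

In the joint framework, convolutions operate on these sub-bands instead of raw pixels, so high-frequency noise is isolated from the features that drive classification.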