Convolutional neural networks (CNNs) provide the sensing and detection community with a discriminative approach to image classification. However, one of the largest limitations of deep CNN image classifiers is the need for extensive training datasets containing a variety of image representations. While current methods such as GAN-based data augmentation, added noise, rotations, and translations can help CNNs better associate new images and their feature representations with those of a learned image class, many fail to provide new contexts of ground-truth feature information. To expand the association of critical class features within CNN image training datasets, an image pairing and training-dataset augmentation paradigm built on a multi-sensor-domain image data fusion algorithm is proposed. The algorithm uses a mutual-information and merit-based feature-selection subroutine to pair highly correlated cross-domain images from multiple sensor-domain image datasets. It then re-augments the corresponding cross-domain image pairs into the opposite sensor domain's feature set via a highest-mutual-information, cross-sensor-domain image concatenation function. This augmented image set then retrains the CNN to recognize broader generalizations of image-class features via cross-domain, mixed representations. Experimental results indicated an increased ability of CNNs to generalize and discriminate between image classes during testing on SAR vehicle, solar-cell device reliability screening, and lung-cancer detection image datasets.
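As a rough illustration of the pairing-and-concatenation idea described above, the sketch below pairs each image in one sensor domain with the image in a second domain that maximizes a histogram-based mutual-information estimate, then channel-concatenates the pairs into mixed-representation training samples. This is a minimal sketch, not the paper's implementation: the function names (`mutual_information`, `pair_by_mi`, `concat_pairs`), the histogram MI estimator, the omission of the merit-based feature-selection step, and the channel-axis concatenation are all illustrative assumptions.

```python
# Hypothetical sketch of MI-based cross-domain image pairing; names and the
# joint-histogram MI estimator are assumptions, not the paper's method.
import numpy as np

def mutual_information(img_a: np.ndarray, img_b: np.ndarray, bins: int = 32) -> float:
    """Estimate MI between two equal-size grayscale images via a joint histogram."""
    joint, _, _ = np.histogram2d(img_a.ravel(), img_b.ravel(), bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal over image A
    py = pxy.sum(axis=0, keepdims=True)   # marginal over image B
    nz = pxy > 0                          # avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

def pair_by_mi(domain_a: np.ndarray, domain_b: np.ndarray) -> list[tuple[int, int]]:
    """For each domain-A image, select the domain-B image with highest MI."""
    pairs = []
    for i, a in enumerate(domain_a):
        scores = [mutual_information(a, b) for b in domain_b]
        pairs.append((i, int(np.argmax(scores))))
    return pairs

def concat_pairs(domain_a, domain_b, pairs):
    """Concatenate each cross-domain pair along a new channel axis to form
    mixed-representation augmented samples."""
    return np.stack([np.stack([domain_a[i], domain_b[j]], axis=-1)
                     for i, j in pairs])

# Toy usage: two small synthetic "sensor domains" of 8 grayscale images each.
rng = np.random.default_rng(0)
domain_a = rng.random((8, 64, 64))
domain_b = rng.random((8, 64, 64))
augmented = concat_pairs(domain_a, domain_b, pair_by_mi(domain_a, domain_b))
print(augmented.shape)  # (8, 64, 64, 2)
```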
Deep machine learning computer vision algorithms have been widely explored for multisensor data fusion. The ability to combine feature-, pixel-, and decision-level information from multiple sensors to enhance the accuracy of platform assessments and decisions has been a significant point of interest for the remote sensing community. In this paper, we propose a dual-branch 3D convolutional neural network (CNN) to bidirectional long short-term memory network (BiLSTM) algorithm that fuses sparse multiresolution, multi-pose, and multimodal VV- and HV-polarization synthetic aperture radar (SAR) vehicle image information to enhance vehicle identification in unfamiliar and incoherent environments. We developed and evaluated the proposed algorithm using the SDMS CV Data Domes repository of 14,430 augmented images per modality, equally represented over ten vehicle classes under similar and dissimilar vehicle pose augmentations, with low to high levels of test-set noise added as zero-mean white Gaussian noise. Our results indicated that the local, individual-modality 3D convolutional fusion of multiple poses and resolutions, as well as the dual-modality fusion of both polarizations, enhanced the algorithm's ability to classify SAR vehicle image information under unfamiliar poses and elevation angles and at moderate to low noise levels.
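To make the architecture described above concrete, the following PyTorch sketch wires together two 3D-CNN branches (one per polarization) whose pose-wise features are concatenated and fed to a BiLSTM before a linear classifier over ten vehicle classes. All layer sizes, the number of poses per stack, the pooling scheme, and the concatenation-based fusion are illustrative assumptions rather than the paper's exact configuration.

```python
# Hypothetical dual-branch 3D-CNN -> BiLSTM fusion sketch; dimensions and
# fusion choices are assumptions, not the published architecture.
import torch
import torch.nn as nn

class Branch3DCNN(nn.Module):
    """One polarization branch: 3D convolutions fuse a stack of multi-pose
    SAR chips into a per-pose feature sequence."""
    def __init__(self, channels: int = 16):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv3d(1, channels, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool3d((1, 2, 2)),             # pool space, keep pose axis
            nn.Conv3d(channels, 2 * channels, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool3d((None, 1, 1)),  # collapse space, keep poses
        )

    def forward(self, x):                        # x: (B, 1, poses, H, W)
        f = self.conv(x)                         # (B, C, poses, 1, 1)
        return f.squeeze(-1).squeeze(-1).transpose(1, 2)  # (B, poses, C)

class DualBranchSARClassifier(nn.Module):
    """VV and HV branch features are fused by concatenation; a BiLSTM then
    models the pose sequence before a linear head over vehicle classes."""
    def __init__(self, num_classes: int = 10, channels: int = 16, hidden: int = 64):
        super().__init__()
        self.vv = Branch3DCNN(channels)
        self.hv = Branch3DCNN(channels)
        self.bilstm = nn.LSTM(4 * channels, hidden, batch_first=True,
                              bidirectional=True)
        self.head = nn.Linear(2 * hidden, num_classes)

    def forward(self, vv, hv):                   # each: (B, 1, poses, H, W)
        fused = torch.cat([self.vv(vv), self.hv(hv)], dim=-1)  # (B, poses, 4C)
        seq, _ = self.bilstm(fused)
        return self.head(seq[:, -1])             # class logits

# Toy forward pass: batch of 2, 6 poses, 32x32 chips per polarization.
vv = torch.randn(2, 1, 6, 32, 32)
hv = torch.randn(2, 1, 6, 32, 32)
print(DualBranchSARClassifier()(vv, hv).shape)  # torch.Size([2, 10])
```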