A stacking-based deep neural network (S-DNN) aggregates a plurality of basic learning modules, one after another, to synthesize a deep neural network (DNN) alternative for pattern classification. In contrast to DNNs trained end to end by backpropagation (BP), each S-DNN layer, i.e., a self-learnable module, is trained decisively and independently, without BP intervention. In this paper, a ridge regression-based S-DNN, dubbed the deep analytic network (DAN), along with its kernelized variant (K-DAN), is devised for multilayer feature relearning from the pre-extracted baseline features and the structured features. Our theoretical formulation demonstrates that DAN/K-DAN relearn by perturbing the intra- and interclass variations, in addition to diminishing the prediction errors. We scrutinize the DAN/K-DAN performance for pattern classification on datasets of varying domains: faces, handwritten digits, and generic objects, to name a few. Unlike typical BP-optimized DNNs, which are trained on gigantic datasets by GPU, we show that DAN/K-DAN are trainable using only a CPU, even on small-scale training sets. Our experimental results show that DAN/K-DAN outperform the present S-DNNs and also the BP-trained DNNs, including the multilayer perceptron, the deep belief network, etc., without data augmentation applied.

Index Terms: Deep analytic network (DAN), face recognition, object recognition, pattern classification, stacking-based deep neural network (S-DNN).
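To make the stacking idea concrete, the sketch below shows a generic ridge-regression stacking layer solved in closed form (no backpropagation), with each layer's soft predictions concatenated back onto the baseline features before the next layer is trained. This is a minimal illustration of the S-DNN principle under assumed conventions, not the authors' DAN formulation; the function names `ridge_layer` and `stack_layers` and the feature-concatenation scheme are hypothetical.

```python
import numpy as np

def ridge_layer(X, Y, reg=1e-2):
    """Solve one analytic layer in closed form: the ridge-regression
    weights W minimizing ||XW - Y||^2 + reg * ||W||^2.

    X: (n_samples, n_features) input features
    Y: (n_samples, n_classes) one-hot targets
    """
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + reg * np.eye(d), X.T @ Y)

def stack_layers(X, Y, depth=3, reg=1e-2):
    """Greedy layer-wise stacking: each layer is trained independently,
    and its soft predictions are concatenated with the original baseline
    features to form the input of the next layer (feature relearning)."""
    feats, weights = X, []
    for _ in range(depth):
        W = ridge_layer(feats, Y, reg)
        weights.append(W)
        # relearn from baseline features + previous layer's predictions
        feats = np.hstack([X, feats @ W])
    return weights
```

Because each layer reduces to a linear solve, the whole stack is trainable on a CPU, which mirrors the claim made for DAN above.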
This paper devises a new means of filter diversification, dubbed multi-fold filter convolution (m-FFC), for face recognition. On the assumption that m-FFC receives single-scale Gabor filters of varying orientations as input, these filters are self-cross-convolved m-fold to instantiate a filter offspring set. The m-FFC flexibility also permits cross convolution among Gabor filters and other filter banks of profoundly dissimilar traits, e.g., principal component analysis (PCA) filters and independent component analysis (ICA) filters. The 2-FFC of Gabor, PCA, and ICA filters thus yields three offspring sets: (1) Gabor filters solely, (2) Gabor-PCA filters, and (3) Gabor-ICA filters, rendering both the learning-free and the learning-based 2-FFC descriptors. To facilitate a sensible Gabor filter selection for m-FFC, the 40 multiscale, multi-orientation Gabor filters are condensed into 8 elementary filters. Aside from that, an average histogram pooling operator is employed to leverage the m-FFC histogram features prior to the final whitening PCA compression. The empirical results substantiate that the 2-FFC descriptors prevail over, or are on par with, other face descriptors on both identification and verification tasks.

Index Terms: Gabor filters, PCA filters, ICA filters, filter convolution, face recognition

DeepID3 by the Chinese University of Hong Kong learns from approximately 300,000 images with 13,000 identities; FaceNet [5] by Google trains CNNs on 200M images spanning over 8M identities. These prevailing CNN models, particularly DeepID3 and FaceNet, reportedly achieve accuracies of 99.53% and 99.63%, respectively, on the labeled faces in the wild (LFW) dataset [41], surpassing the human-level performance of 97.53%. By contrast, the FB approaches, e.g., PCANet [14], the discriminant face descriptor (DFD) [15], the compact binary face descriptor (CBFD) [16], binarized statistical image features (BSIF) [17-18], DCTNet [20], etc., are typically equipped with only one or two filtering layers.
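The core 2-FFC operation, cross-convolving every filter in one bank with every filter in another to instantiate an offspring set, can be sketched as follows. This is a minimal illustration under assumed conventions (full 2-D convolution via FFT zero-padding), not the paper's implementation; the names `conv2d_full` and `two_fold_ffc` are hypothetical.

```python
import numpy as np

def conv2d_full(a, b):
    """Full 2-D linear convolution of two filters, computed by
    zero-padded FFT multiplication."""
    m = a.shape[0] + b.shape[0] - 1
    n = a.shape[1] + b.shape[1] - 1
    return np.real(np.fft.ifft2(np.fft.fft2(a, (m, n)) * np.fft.fft2(b, (m, n))))

def two_fold_ffc(bank_a, bank_b):
    """2-FFC offspring set: cross-convolve every filter in bank_a with
    every filter in bank_b (e.g., Gabor x PCA filters)."""
    return [conv2d_full(fa, fb) for fa in bank_a for fb in bank_b]
```

Passing the same Gabor bank as both arguments yields the Gabor-only offspring set; pairing it with PCA or ICA filter banks yields the Gabor-PCA and Gabor-ICA sets described above.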
Despite being simple and easy to use, these CNN simplifications promise state-of-the-art robustness on generic image classification problems, including faces.

The earliest FB approaches are reviewed and compared in [6]. They share a common three-stage pipeline, referred to as filter-rectify-filter (FRF): (1) a convolutional stage based on heuristically designed filter banks, e.g., Laws masks, ring and wedge filters, Gabor filters, wavelet transforms, packets and frames, the discrete cosine transform (DCT), etc., or other optimized filters, e.g., principal component analysis (PCA) eigenfilters, the Karhunen-Loeve transform, prediction error filters, optimized Gabor filters, etc.; (2) a nonlinearity, a.k.a. a filter response rectification step, e.g., magnitude, squaring, rectified sigmoid, etc.; (3) pooling (filtering) operations, e.g., spatial averaging, smoothing, or nonlinear inhibition, to remove the inhomogeneity in the rectified responses within a homogeneous region. The local energy function, comprising stages (2) and (3), outputs a set of feature images, one per filter...
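The three-stage FRF pipeline described above can be sketched end to end as follows. This is a minimal sketch assuming magnitude rectification and block-average pooling; the function name `frf_pipeline` and the specific pooling block size are hypothetical choices, not taken from [6].

```python
import numpy as np

def frf_pipeline(image, filters, pool=4):
    """Filter-rectify-filter (FRF): (1) convolve with each filter,
    (2) rectify the responses, (3) pool to get one energy map per filter."""
    feats = []
    for f in filters:
        # (1) convolutional stage: valid 2-D correlation with the filter
        h, w = f.shape
        H, W = image.shape
        resp = np.array([[np.sum(image[i:i + h, j:j + w] * f)
                          for j in range(W - w + 1)]
                         for i in range(H - h + 1)])
        # (2) rectification: magnitude nonlinearity
        rect = np.abs(resp)
        # (3) pooling: block averaging to suppress local inhomogeneity
        ph, pw = rect.shape[0] // pool, rect.shape[1] // pool
        pooled = (rect[:ph * pool, :pw * pool]
                  .reshape(ph, pool, pw, pool)
                  .mean(axis=(1, 3)))
        feats.append(pooled)
    return feats
```

Stages (2) and (3) together implement the local energy function: one smoothed energy map per filter in the bank.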