In this paper, we address the challenging task of simultaneously optimizing (i) the weights of a neural network, (ii) the number of neurons in each hidden layer, and (iii) the subset of active input features (i.e., feature selection). While these problems are traditionally dealt with separately, we propose an efficient regularized formulation enabling their simultaneous execution using standard optimization routines. Specifically, we extend the group Lasso penalty, originally proposed in the linear regression literature, to impose group-level sparsity on the network's connections, where each group is defined as the set of outgoing weights from a unit. Depending on the specific case, the weights can be related to an input variable, to a hidden neuron, or to a bias unit, so that all of the aforementioned tasks are performed simultaneously and yield a compact network. We carry out an extensive experimental evaluation, comparing against classical weight decay and Lasso penalties, both on a toy dataset for handwritten digit recognition and on multiple realistic mid-scale classification benchmarks. Comparative results demonstrate the potential of our proposed sparse group Lasso penalty in producing extremely compact networks, with a significantly lower number of input features and a classification accuracy that is equal or only slightly inferior to that obtained with standard regularization terms.
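As a rough illustration of the penalty described in this abstract, the sketch below computes a group Lasso term in which each group is the set of outgoing weights of one unit. It assumes a weight matrix of shape `(n_in, n_out)` whose row `i` holds the outgoing weights of unit `i`, and it uses the usual square-root-of-group-size scaling from the group Lasso literature; both choices, the function names, and the omission of bias groups are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np

def group_lasso_penalty(W):
    """Group Lasso over rows: one group per unit's outgoing weights (assumed layout)."""
    W = np.asarray(W, dtype=float)
    group_size = W.shape[1]
    # Sum of Euclidean norms of the rows, scaled by sqrt(group size).
    return np.sqrt(group_size) * np.sum(np.linalg.norm(W, axis=1))

def sparse_group_lasso_penalty(W):
    """Group Lasso term plus an element-wise L1 (Lasso) term."""
    W = np.asarray(W, dtype=float)
    return group_lasso_penalty(W) + np.sum(np.abs(W))

# Hypothetical 2x2 layer: the second unit's outgoing weights are all zero,
# so it contributes nothing to either penalty.
W_example = np.array([[3.0, 4.0],
                      [0.0, 0.0]])
gl_value = group_lasso_penalty(W_example)
sgl_value = sparse_group_lasso_penalty(W_example)
```

Because the row norm is not differentiable at zero, the penalty can push an entire row to zero, which corresponds to removing that input feature or hidden neuron.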
This paper introduces a new class of nonlinear adaptive filters whose structure is based on the Hammerstein model. Such filters derive from the functional link adaptive filter (FLAF) model, defined by a nonlinear input expansion, which enhances the representation of the input signal through a projection into a higher-dimensional space, followed by adaptive filtering. In particular, two robust FLAF-based architectures are proposed and designed ad hoc to tackle nonlinearities in acoustic echo cancellation (AEC). The simpler architecture is the split FLAF, which separates the adaptation of linear and nonlinear elements using two different adaptive filters in parallel. In this way, the linear and the nonlinear modeling can each be carried out separately and as effectively as possible. Moreover, in order to provide robustness against different degrees of nonlinearity, a collaborative FLAF is proposed, based on the adaptive combination of filters. This architecture makes it possible to achieve the best performance regardless of the degree of nonlinearity in the echo path. Experimental results show the effectiveness of the proposed FLAF-based architectures in nonlinear AEC scenarios, making them an important solution for the modeling of nonlinear acoustic channels.
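The split structure described in this abstract can be sketched as follows. The example assumes a trigonometric functional link expansion and normalized LMS updates for both paths; the expansion type, the step sizes, and the names `trig_expansion` and `sflaf_step` are illustrative assumptions rather than the authors' exact design.

```python
import numpy as np

def trig_expansion(x_buf, order):
    """Expand an input buffer with sin/cos functional links up to `order` (assumed expansion)."""
    feats = []
    for p in range(1, order + 1):
        feats.append(np.sin(p * np.pi * x_buf))
        feats.append(np.cos(p * np.pi * x_buf))
    return np.concatenate(feats)

def sflaf_step(x_buf, d, w_lin, w_nl, order, mu_lin=0.1, mu_nl=0.1, eps=1e-6):
    """One iteration: linear path on x_buf, nonlinear path on its expansion, outputs summed."""
    g = trig_expansion(x_buf, order)
    y = w_lin @ x_buf + w_nl @ g
    e = d - y
    # Each path is adapted by its own NLMS update driven by the shared error.
    w_lin = w_lin + mu_lin * e * x_buf / (x_buf @ x_buf + eps)
    w_nl = w_nl + mu_nl * e * g / (g @ g + eps)
    return y, e, w_lin, w_nl

# Hypothetical usage: 4-tap buffer, expansion order 2, desired sample 1.0.
x_buf = np.array([0.5, 0.0, 0.0, 0.0])
w_lin = np.zeros(4)
w_nl = np.zeros(16)  # 4 taps * 2 functions * order 2
_, e_first, w_lin, w_nl = sflaf_step(x_buf, 1.0, w_lin, w_nl, order=2)
_, e_second, _, _ = sflaf_step(x_buf, 1.0, w_lin, w_nl, order=2)
```

Keeping two weight vectors lets the step sizes of the linear and nonlinear paths be tuned independently, which is the separation the abstract refers to.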
The extreme learning machine (ELM) was recently proposed as a unifying framework for different families of learning algorithms. The classical ELM model consists of a linear combination of a fixed number of nonlinear expansions of the input vector. Learning in ELM is hence equivalent to finding the optimal weights that minimize the error on a dataset. The update works in batch mode, either with explicit feature mappings or with implicit mappings defined by kernels. Although an online version has been proposed for the former, no work has been done up to this point for the latter, and whether an efficient learning algorithm for online kernel-based ELM exists remains an open problem. By explicating some connections between nonlinear adaptive filtering and ELM theory, in this brief, we present an algorithm for this task. In particular, we propose a straightforward extension of the well-known kernel recursive least-squares, belonging to the kernel adaptive filtering (KAF) family, to the ELM framework. We call the resulting algorithm the kernel online sequential ELM (KOS-ELM). Moreover, we consider two different criteria used in the KAF field to obtain sparse filters and extend them to our context. We show that KOS-ELM, with their integration, can result in a highly efficient algorithm, both in terms of obtained generalization error and training time. Empirical evaluations demonstrate interesting results on some benchmarking datasets.
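For orientation, the sketch below implements a plain kernel recursive least-squares recursion of the kind this abstract builds on. It is not the KOS-ELM algorithm itself: it assumes a Gaussian kernel, a fixed regularization constant `lam`, and no sparsification criterion, so the dictionary grows by one center per sample. The class and parameter names are hypothetical.

```python
import numpy as np

def gaussian_kernel(a, b, gamma=1.0):
    diff = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    return np.exp(-gamma * np.dot(diff, diff))

class OnlineKernelRLS:
    """Growing kernel RLS; tracks the solution of (K + lam * I) alpha = d."""

    def __init__(self, lam=0.1, gamma=1.0):
        self.lam, self.gamma = lam, gamma
        self.centers = []
        self.Q = None      # running inverse of (K + lam * I)
        self.alpha = None  # expansion coefficients

    def _kvec(self, x):
        return np.array([gaussian_kernel(c, x, self.gamma) for c in self.centers])

    def predict(self, x):
        if not self.centers:
            return 0.0
        return float(self._kvec(x) @ self.alpha)

    def update(self, x, d):
        kxx = gaussian_kernel(x, x, self.gamma)
        if not self.centers:
            self.Q = np.array([[1.0 / (self.lam + kxx)]])
            self.alpha = np.array([self.Q[0, 0] * d])
        else:
            h = self._kvec(x)
            z = self.Q @ h
            r = self.lam + kxx - z @ h   # Schur complement of the new sample
            e = d - h @ self.alpha       # a priori prediction error
            n = len(self.centers)
            Q = np.empty((n + 1, n + 1))
            Q[:n, :n] = self.Q + np.outer(z, z) / r
            Q[:n, n] = -z / r
            Q[n, :n] = -z / r
            Q[n, n] = 1.0 / r
            self.Q = Q
            self.alpha = np.append(self.alpha - z * e / r, e / r)
        self.centers.append(np.asarray(x, dtype=float))

# Hypothetical usage on synthetic data, compared against the batch solution.
rng = np.random.default_rng(0)
X = rng.normal(size=(15, 2))
targets = np.sin(X[:, 0]) + X[:, 1]
model = OnlineKernelRLS(lam=0.1, gamma=0.5)
for x_i, d_i in zip(X, targets):
    model.update(x_i, d_i)
K = np.array([[gaussian_kernel(a, b, 0.5) for b in X] for a in X])
alpha_batch = np.linalg.solve(K + 0.1 * np.eye(len(X)), targets)
```

Since the cost of each update grows with the number of stored centers, a sparsification criterion such as those mentioned in the abstract would be needed to keep the model size bounded in practice.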
Learning from data in the quaternion domain enables us to exploit the internal dependencies of 4D signals and to treat them as a single entity. One class of data that is particularly well suited to quaternion-valued processing is 3D acoustic signals in their spherical harmonics decomposition. In this paper, we address the problem of localizing and detecting sound events in the spatial sound field by using quaternion-valued data processing. In particular, we consider the spherical harmonic components of the signals captured by a first-order ambisonic microphone and process them by using a quaternion convolutional neural network. Experimental results show that the proposed approach exploits the correlated nature of the ambisonic signals, thus improving accuracy in 3D sound event detection and localization.
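Quaternion-valued layers combine weights and inputs through the Hamilton product rather than an ordinary real-valued product. The sketch below shows that product on arrays whose last axis holds the four components; treating the four first-order ambisonic channels as those components is an assumption about the data layout, and the function name is hypothetical.

```python
import numpy as np

def hamilton_product(q, p):
    """Hamilton product of quaternions stored as (..., 4) arrays in (real, i, j, k) order."""
    a1, b1, c1, d1 = np.moveaxis(np.asarray(q, dtype=float), -1, 0)
    a2, b2, c2, d2 = np.moveaxis(np.asarray(p, dtype=float), -1, 0)
    return np.stack([
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    ], axis=-1)

# Basis elements: i * j = k, while j * i = -k (the product is not commutative).
unit_i = np.array([0.0, 1.0, 0.0, 0.0])
unit_j = np.array([0.0, 0.0, 1.0, 0.0])
ij = hamilton_product(unit_i, unit_j)
ji = hamilton_product(unit_j, unit_i)
```

Each output component mixes all four input components with shared weights, which is how a quaternion layer ties the channels together instead of treating them as independent feature maps.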
Recently, a new class of nonlinear adaptive filtering architectures has been introduced based on the functional link adaptive filter (FLAF) model. Here we focus specifically on the split FLAF (SFLAF) architecture, which separates the adaptation of linear and nonlinear coefficients using two different adaptive filters in parallel. This property makes the SFLAF well suited to problems such as nonlinear acoustic echo cancellation (NAEC), in which the separation of filtering tasks brings some performance improvement. Although flexibility is one of the main features of the SFLAF, problems may occur when the degree of nonlinearity of the input signal is not known a priori, since this leads to a non-optimal choice of the number of coefficients to be adapted in the nonlinear path of the SFLAF. In order to tackle this problem, we propose a proportionate FLAF (PFLAF), which is based on sparse representations of functional links and thus gives less importance to those coefficients that do not actively contribute to the nonlinear modeling. Experimental results show that the proposed PFLAF achieves a performance improvement with respect to the SFLAF in several nonlinear scenarios.
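The idea of giving less importance to inactive coefficients can be illustrated with a proportionate update. The sketch below uses an improved-PNLMS-style gain rule as a stand-in; whether the PFLAF uses exactly this rule is not stated in the abstract, so the gain formula, the parameter `alpha`, and the function names are assumptions.

```python
import numpy as np

def proportionate_gains(w, alpha=0.0, eps=1e-8):
    """IPNLMS-style gains: a uniform part plus a part proportional to |w| (assumed rule)."""
    w = np.asarray(w, dtype=float)
    n = w.size
    return (1.0 - alpha) / (2.0 * n) + (1.0 + alpha) * np.abs(w) / (2.0 * np.sum(np.abs(w)) + eps)

def proportionate_nlms_step(w, g_buf, d, mu=0.5, alpha=0.0, delta=1e-6):
    """One proportionate NLMS update on an expanded (functional link) input buffer g_buf."""
    gains = proportionate_gains(w, alpha)
    e = d - w @ g_buf
    w_new = w + mu * e * gains * g_buf / (g_buf @ (gains * g_buf) + delta)
    return w_new, e

# Hypothetical sparse coefficient vector: only two entries are active.
w_sparse = np.array([1.0, 0.0, 0.0, 0.1, 0.0])
gains_example = proportionate_gains(w_sparse)
```

Coefficients with larger magnitude receive larger individual step sizes, so the few active functional links adapt quickly while near-zero ones are updated only weakly.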
In this paper, two novel nonlinear cascade adaptive architectures, here called sandwich models, suitable for the identification of general nonlinear systems are presented. The proposed architectures rely on the combination of structural blocks, each one implementing a linear filter or a memoryless nonlinear function. All the nonlinear functions involved in the adaptation process are based on spline functions and can be easily modified during learning using gradient-based techniques. In particular, a simple form of the on-line adaptation algorithms for the two architectures is derived. In addition, we analytically obtain a bound for the selection of the learning rates involved in the learning algorithms, in order to guarantee convergence towards a minimum of the cost function. Finally, some experimental results demonstrate the effectiveness of the proposed method.
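A single adaptive spline nonlinearity of the kind used as a building block here might look like the sketch below. It assumes uniformly spaced knots, a Catmull-Rom basis, and an LMS-style update of the four control points that are active for the current input; the knot layout, the basis choice, and the function names are illustrative assumptions, and the full sandwich cascades with linear filters are not reproduced.

```python
import numpy as np

# Catmull-Rom basis matrix (assumed spline type).
CR = 0.5 * np.array([[-1.0, 3.0, -3.0, 1.0],
                     [2.0, -5.0, 4.0, -1.0],
                     [-1.0, 0.0, 1.0, 0.0],
                     [0.0, 2.0, 0.0, 0.0]])

def spline_eval(s, q, x0, dx):
    """Evaluate the spline at s; knot k sits at x0 + k * dx with control point q[k]."""
    j = int(np.floor((s - x0) / dx))
    j = min(max(j, 1), len(q) - 3)       # keep the 4-point span inside q
    u = (s - x0) / dx - j                # local abscissa within the span
    basis = np.array([u ** 3, u ** 2, u, 1.0]) @ CR
    return float(basis @ q[j - 1:j + 3]), j, basis

def spline_lms_step(s, d, q, x0, dx, mu=0.1):
    """Gradient step on the squared error with respect to the active control points."""
    y, j, basis = spline_eval(s, q, x0, dx)
    e = d - y
    q = q.copy()
    q[j - 1:j + 3] += mu * e * basis     # dy/dq over the span equals the basis vector
    return q, e

# Hypothetical setup: 9 knots on [-2, 2], control points sampled from 2x + 1.
x0, dx = -2.0, 0.5
knots = x0 + dx * np.arange(9)
q_line = 2.0 * knots + 1.0
y_line, _, _ = spline_eval(0.3, q_line, x0, dx)
```

Because only four control points are involved for any given input, each update is local and cheap, which is what makes this kind of nonlinearity easy to adapt on-line.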