There is an ongoing debate over the capabilities of hierarchical neural feedforward architectures for performing real-world invariant object recognition. Although a variety of hierarchical models exists, appropriate supervised and unsupervised learning methods are still an issue of intense research. We propose a feedforward model for recognition that shares components like weight sharing, pooling stages, and competitive nonlinearities with earlier approaches but focuses on new methods for learning optimal feature-detecting cells in intermediate stages of the hierarchical network. We show that principles of sparse coding, which were previously mostly applied to the initial feature detection stages, can also be employed to obtain optimized intermediate complex features. We suggest a new approach to optimize the learning of sparse features under the constraints of a weight-sharing or convolutional architecture that uses pooling operations to achieve gradual invariance in the feature hierarchy. The approach explicitly enforces symmetry constraints like translation invariance on the feature set. This leads to a dimension reduction in the search space of optimal features and allows determining more efficiently the basis representatives, which achieve a sparse decomposition of the input. We analyze the quality of the learned feature representation by investigating the recognition performance of the resulting hierarchical network on object and face databases. We show that a hierarchy with features learned on a single object data set can also be applied to face recognition without parameter changes and is competitive with other recent machine learning recognition approaches. To investigate the effect of the interplay between sparse coding and processing nonlinearities, we also consider alternative feedforward pooling nonlinearities such as presynaptic maximum selection and sum-of-squares integration. 
The comparison shows that a combination of strong competitive nonlinearities with sparse coding offers the best recognition performance in the difficult scenario of segmentation-free recognition in cluttered surround. We demonstrate that for both learning and recognition, a precise segmentation of the objects is not necessary.
This paper proposes a biologically inspired and technically implemented sound localization system to robustly estimate the position of a sound source in the frontal azimuthal half-plane. For localization, binaural cues are extracted using cochleagrams generated by a cochlear model that serve as input to the system. The basic idea of the model is to separately measure interaural time differences and interaural level differences for a number of frequencies and process these measurements as a whole. This leads to two-dimensional frequency versus time-delay representations of binaural cues, so-called activity maps. A probabilistic evaluation is presented to estimate the position of a sound source over time based on these activity maps. Learned reference maps for different azimuthal positions are integrated into the computation to gain time-dependent discrete conditional probabilities. At every timestep these probabilities are combined over frequencies and binaural cues to estimate the sound source position. In addition, they are propagated over time to improve position estimation. This leads to a system that is able to localize audible signals, for example human speech signals, even in reverberating environments.
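The probabilistic evaluation described above can be sketched as a simple recursive estimate over discrete azimuth candidates. This is an illustration under stated assumptions, not the authors' method: conditional independence across frequency channels and cues, a hypothetical `stay` parameter for the temporal propagation, and made-up likelihood values.

```python
def normalize(p):
    """Scale a list of non-negative values so that it sums to one."""
    total = sum(p)
    return [x / total for x in p]


def combine(likelihoods):
    """Multiply per-channel likelihoods over azimuth candidates.

    Assumes (for illustration) that channels are conditionally independent.
    """
    combined = [1.0] * len(likelihoods[0])
    for channel in likelihoods:
        combined = [c * l for c, l in zip(combined, channel)]
    return normalize(combined)


def propagate(prior, evidence, stay=0.9):
    """One timestep: mix the prior with a uniform term, then weigh by evidence."""
    n = len(prior)
    predicted = [stay * p + (1.0 - stay) / n for p in prior]
    return normalize([pr * ev for pr, ev in zip(predicted, evidence)])


azimuths = [-60, -30, 0, 30, 60]            # candidate positions in degrees
posterior = [1.0 / len(azimuths)] * len(azimuths)

# Hypothetical likelihoods: two timesteps, two channels each.
frames = [
    [[0.1, 0.2, 0.4, 0.2, 0.1], [0.1, 0.1, 0.3, 0.4, 0.1]],
    [[0.1, 0.1, 0.2, 0.5, 0.1], [0.1, 0.2, 0.2, 0.4, 0.1]],
]
for frame in frames:
    posterior = propagate(posterior, combine(frame))

best = max(range(len(azimuths)), key=posterior.__getitem__)
estimate = azimuths[best]
```

With these toy values the estimate moves from 0 degrees after the first frame to 30 degrees after the second, showing how evidence accumulated over time can override a single ambiguous frame.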
Although William James and, more explicitly, Donald Hebb's theory of cell assemblies already suggested that activity-dependent rewiring of neuronal networks is the substrate of learning and memory, over the last six decades most theoretical work on memory has focused on plasticity of existing synapses in prewired networks. Research in the last decade has emphasized that structural modification of synaptic connectivity is common in the adult brain and tightly correlated with learning and memory. Here we present a parsimonious computational model for learning by structural plasticity. The basic modeling units are “potential synapses” defined as locations in the network where synapses can potentially grow to connect two neurons. This model generalizes well-known previous models for associative learning based on weight plasticity. Therefore, existing theory can be applied to analyze how many memories and how much information structural plasticity can store in a synapse. Surprisingly, we find that structural plasticity largely outperforms weight plasticity and can achieve a much higher storage capacity per synapse. The effect of structural plasticity on the structure of sparsely connected networks is quite intuitive: Structural plasticity increases the “effectual network connectivity”, that is, the network wiring that specifically supports storage and recall of the memories. Further, this model of structural plasticity produces gradients of effectual connectivity in the course of learning, thereby explaining various cognitive phenomena including graded amnesia, catastrophic forgetting, and the spacing effect.
Spike synchronization is thought to have a constructive role for feature integration, attention, associative learning, and the formation of bidirectionally connected Hebbian cell assemblies. By contrast, theoretical studies on spike-timing-dependent plasticity (STDP) report an inherently decoupling influence of spike synchronization on synaptic connections of coactivated neurons. For example, bidirectional synaptic connections as found in cortical areas could be reproduced only by assuming realistic models of STDP and rate coding. We resolve this conflict by theoretical analysis and simulation of various simple and realistic STDP models that provide a more complete characterization of the conditions under which STDP leads to either coupling or decoupling of neurons firing in synchrony. In particular, we show that STDP consistently couples synchronized neurons if key model parameters are matched to physiological data: First, synaptic potentiation must be significantly stronger than synaptic depression for small (positive or negative) time lags between presynaptic and postsynaptic spikes. Second, spike synchronization must be sufficiently imprecise, for example, within a time window of 5–10 ms instead of 1 ms. Third, axonal propagation delays should not be much larger than dendritic delays. Under these assumptions, synchronized neurons will be strongly coupled, leading to a dominance of bidirectional synaptic connections even for simple STDP models and low mean firing rates at the level of spontaneous activity.
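The three conditions above can be made concrete with a minimal sketch of a pair-based exponential STDP window. All parameter values (amplitudes, time constant, delays, jitter widths) are hypothetical choices for illustration and are not taken from the paper; the sketch averages the weight change over spike-time differences spread uniformly within a jitter window.

```python
import math


def stdp(lag_ms, a_plus=1.0, a_minus=0.5, tau_ms=20.0):
    """Exponential pair-based STDP window.

    lag_ms is postsynaptic minus presynaptic spike time as seen at the synapse.
    Potentiation (a_plus) is assumed stronger than depression (a_minus).
    """
    if lag_ms >= 0:
        return a_plus * math.exp(-lag_ms / tau_ms)
    return -a_minus * math.exp(lag_ms / tau_ms)


def mean_change(jitter_ms, axonal_ms=1.0, dendritic_ms=0.5, steps=2001):
    """Average weight change for spike-time differences uniform in [-jitter, +jitter]."""
    # An axonal delay longer than the dendritic delay shifts the lag
    # seen at the synapse toward negative values.
    shift = dendritic_ms - axonal_ms
    total = 0.0
    for k in range(steps):
        dt = -jitter_ms + 2.0 * jitter_ms * k / (steps - 1)
        total += stdp(dt + shift)
    return total / steps


precise = mean_change(jitter_ms=0.1)     # near-perfect synchrony
imprecise = mean_change(jitter_ms=10.0)  # synchrony within roughly 10 ms
```

Under these assumptions, near-perfect synchrony yields a negative mean change (decoupling), whereas imprecise synchrony yields a positive one. Because the jitter distribution is symmetric, the same average applies to the reverse connection, which is consistent with the bidirectional coupling described in the abstract.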