We analyze the stochastic average gradient (SAG) method for optimizing the sum of a finite number of smooth convex functions. Like stochastic gradient (SG) methods, the SAG method's iteration cost is independent of the number of terms in the sum. However, by incorporating a memory of previous gradient values, the SAG method achieves a faster convergence rate than black-box SG methods. The convergence rate is improved from O(1/√k) to O(1/k) in general, and when the sum is strongly convex the convergence rate is improved from the sub-linear O(1/k) to a linear convergence rate of the form O(ρ^k) for ρ < 1. Further, in many cases the convergence rate of the new method is also faster than black-box deterministic gradient methods, in terms of the number of gradient evaluations. This extends our earlier work [Le Roux et al., 2012], which only led to a faster rate for well-conditioned strongly-convex problems. Numerical experiments indicate that the new algorithm often dramatically outperforms existing SG and deterministic gradient methods, and that the performance may be further improved through the use of non-uniform sampling strategies.
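The SAG update described above can be illustrated with a minimal NumPy sketch on a ridge-regularized least-squares problem. The function and variable names, the test problem, and the conservative step-size choice are illustrative assumptions, not the paper's own code; the paper's analysis covers more refined step sizes.

```python
import numpy as np

def sag(A, b, lam=0.1, alpha=None, n_iters=5000, seed=0):
    """Minimal SAG sketch for the objective
    (1/n) * sum_i [ 0.5*(a_i^T x - b_i)^2 + (lam/2)*||x||^2 ].
    Illustrative implementation; names and step size are assumptions."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    if alpha is None:
        # Each f_i has gradient Lipschitz constant ||a_i||^2 + lam.
        L = np.max(np.sum(A ** 2, axis=1)) + lam
        alpha = 1.0 / (16 * L)      # conservative constant step size
    x = np.zeros(d)
    grads = np.zeros((n, d))        # memory of the last gradient of each f_i
    g_sum = np.zeros(d)             # running sum of the stored gradients
    for _ in range(n_iters):
        i = rng.integers(n)
        g_i = (A[i] @ x - b[i]) * A[i] + lam * x  # fresh gradient of f_i
        g_sum += g_i - grads[i]                   # swap it into the sum
        grads[i] = g_i
        x -= alpha * g_sum / n                    # step along the averaged gradient
    return x
```

The key point, matching the abstract: only one gradient g_i is evaluated per iteration (like SG), but the step uses the average of the most recently seen gradient of every term, which is what yields the faster rates.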
Deep belief networks (DBN) are generative neural network models with many layers of hidden explanatory factors, recently introduced by Hinton, Osindero, and Teh (2006) along with a greedy layer-wise unsupervised learning algorithm. The building block of a DBN is a probabilistic model called a restricted Boltzmann machine (RBM), used to represent one layer of the model. Restricted Boltzmann machines are interesting because inference is easy in them and because they have been successfully used as building blocks for training deeper models. We first prove that adding hidden units yields strictly improved modeling power, while a second theorem shows that RBMs are universal approximators of discrete distributions. We then study the question of whether DBNs with more layers are strictly more powerful in terms of representational power. This suggests a new and less greedy criterion for training RBMs within DBNs.
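The abstract's claim that "inference is easy" in an RBM refers to its factorial conditional distributions. A minimal binary-RBM sketch below makes this concrete, together with one step of contrastive divergence (CD-1), a standard approximate training rule; the class layout and parameter names are illustrative assumptions, not code from the paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class RBM:
    """Minimal binary RBM sketch with energy
    E(v, h) = -v @ W @ h - b @ v - c @ h.
    Illustrative only; names are assumptions."""
    def __init__(self, n_visible, n_hidden, seed=0):
        rng = np.random.default_rng(seed)
        self.W = 0.01 * rng.standard_normal((n_visible, n_hidden))
        self.b = np.zeros(n_visible)   # visible biases
        self.c = np.zeros(n_hidden)    # hidden biases
        self.rng = rng

    def p_h_given_v(self, v):
        # Conditionals factorize over units: this is the "easy inference".
        return sigmoid(v @ self.W + self.c)

    def p_v_given_h(self, h):
        return sigmoid(h @ self.W.T + self.b)

    def cd1_step(self, v0, lr=0.05):
        """One CD-1 update from a single visible vector v0."""
        ph0 = self.p_h_given_v(v0)
        h0 = (self.rng.random(ph0.shape) < ph0).astype(float)  # sample hiddens
        v1 = self.p_v_given_h(h0)            # mean-field reconstruction
        ph1 = self.p_h_given_v(v1)
        self.W += lr * (np.outer(v0, ph0) - np.outer(v1, ph1))
        self.b += lr * (v0 - v1)
        self.c += lr * (ph0 - ph1)
```

In the greedy layer-wise DBN scheme the abstract mentions, one RBM is trained this way, then its hidden activations become the "visible" data for the next layer's RBM.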
In this paper, we show a direct relation between spectral embedding methods and kernel PCA, and how both are special cases of a more general learning problem, that of learning the principal eigenfunctions of an operator defined from a kernel and the unknown data generating density.
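The operator-eigenfunction view described above can be sketched in a few lines: kernel PCA reduces to an eigendecomposition of the doubly-centered Gram matrix, and the estimated eigenfunctions extend to new points through a Nyström-style formula. The RBF kernel, its bandwidth, and all names below are illustrative assumptions, not the paper's notation.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=0.5):
    """Gaussian (RBF) kernel matrix; an illustrative kernel choice."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def kernel_pca_eigenfunctions(X, n_components=2, gamma=0.5):
    """Kernel PCA via the centered Gram matrix. Returns a function that
    evaluates the estimated principal eigenfunctions at new points."""
    n = len(X)
    K = rbf_kernel(X, X, gamma)
    H = np.eye(n) - np.ones((n, n)) / n        # centering matrix
    Kc = H @ K @ H
    w, V = np.linalg.eigh(Kc)
    idx = np.argsort(w)[::-1][:n_components]   # keep the leading eigenpairs
    w, V = w[idx], V[:, idx]

    def eigenfunctions(Y):
        Ky = rbf_kernel(Y, X, gamma)
        # Center the new kernel rows consistently with the training Gram.
        Kyc = Ky - Ky.mean(1, keepdims=True) - K.mean(0) + K.mean()
        # Nystrom-style extension: f_k(y) = (1/w_k) * sum_i V[i,k] * Kc(y, x_i)
        return Kyc @ V / w

    return eigenfunctions, w / n               # empirical operator eigenvalues
```

Evaluated at the training points themselves, these eigenfunctions reduce to the Gram-matrix eigenvectors, which is the sense in which spectral embeddings and kernel PCA are the same computation seen through the operator.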
Key points
• Parvalbumin-expressing interneurons represent a major source of inhibition of CA1 hippocampal principal cells and influence both spike timing precision and network oscillations.
• These interneurons receive both feed-forward and feedback excitatory inputs which recruit them in the hippocampal network.
• In this study, we compared the functional properties of these two inputs and how they may be modified by neuronal activity.
• We show that calcium-permeable AMPA receptors and NMDA receptors are differentially distributed at feed-forward versus feedback inputs and act as coincidence detectors of opposing modalities.
• Our results reveal that the two major excitatory inputs onto CA1 parvalbumin-expressing interneurons undergo long term plasticity with different frequency regimes of afferent activity, which is likely to influence their function under both normal and pathological conditions.

Abstract
Hippocampal parvalbumin-expressing interneurons (PV INs) provide fast and reliable GABAergic signalling to principal cells and orchestrate hippocampal ensemble activities. Precise coordination of principal cell activity by PV INs relies in part on the efficacy of excitatory afferents that recruit them in the hippocampal network. Feed-forward (FF) inputs in particular from Schaffer collaterals influence spike timing precision in CA1 principal cells, whereas local feedback (FB) inputs may contribute to pacemaker activities. Although PV INs have been shown to undergo activity-dependent long term plasticity, how both inputs are modulated during principal cell firing is unknown. Here we show that FF and FB synapses onto PV INs are endowed with distinct postsynaptic glutamate receptors which set opposing long-term plasticity rules. Inward-rectifying AMPA receptors (AMPARs) expressed at both FF and FB inputs mediate a form of anti-Hebbian long term potentiation (LTP), relying on coincident membrane hyperpolarization and synaptic activation.
In contrast, FF inputs are largely devoid of NMDA receptors (NMDARs), which are more abundant at FB afferents and confer on them an additional form of LTP with Hebbian properties. Both forms of LTP are expressed with no apparent change in presynaptic function. The specific endowment of FF and FB inputs with distinct coincidence detectors allows them to be differentially tuned upon high frequency afferent activity. Thus, high frequency (>20 Hz) stimulation specifically potentiates FB, but not FF afferents. We propose that these differential,