We analyze the stochastic average gradient (SAG) method for optimizing the sum of a finite number of smooth convex functions. Like stochastic gradient (SG) methods, the SAG method's iteration cost is independent of the number of terms in the sum. However, by incorporating a memory of previous gradient values, the SAG method achieves a faster convergence rate than black-box SG methods. The convergence rate is improved from O(1/√k) to O(1/k) in general, and when the sum is strongly convex the convergence rate is improved from the sub-linear O(1/k) to a linear convergence rate of the form O(ρ^k) for ρ < 1. Further, in many cases the convergence rate of the new method is also faster than black-box deterministic gradient methods, in terms of the number of gradient evaluations. This extends our earlier work [Le Roux et al., 2012], which only led to a faster rate for well-conditioned strongly-convex problems. Numerical experiments indicate that the new algorithm often dramatically outperforms existing SG and deterministic gradient methods, and that the performance may be further improved through the use of non-uniform sampling strategies.
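The SAG update described above can be illustrated with a minimal NumPy sketch on a ridge-regularized least-squares problem. The function and variable names, the test problem, and the conservative step-size choice are illustrative assumptions, not the paper's own code; the paper's analysis covers more refined step sizes.

```python
import numpy as np

def sag(A, b, lam=0.1, alpha=None, n_iters=5000, seed=0):
    """Minimal SAG sketch for the objective
    (1/n) * sum_i [ 0.5*(a_i^T x - b_i)^2 + (lam/2)*||x||^2 ].
    Illustrative implementation; names and step size are assumptions."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    if alpha is None:
        # Each f_i has gradient Lipschitz constant ||a_i||^2 + lam.
        L = np.max(np.sum(A ** 2, axis=1)) + lam
        alpha = 1.0 / (16 * L)      # conservative constant step size
    x = np.zeros(d)
    grads = np.zeros((n, d))        # memory of the last gradient of each f_i
    g_sum = np.zeros(d)             # running sum of the stored gradients
    for _ in range(n_iters):
        i = rng.integers(n)
        g_i = (A[i] @ x - b[i]) * A[i] + lam * x  # fresh gradient of f_i
        g_sum += g_i - grads[i]                   # swap it into the sum
        grads[i] = g_i
        x -= alpha * g_sum / n                    # step along the averaged gradient
    return x
```

The key point, matching the abstract: only one gradient g_i is evaluated per iteration (like SG), but the step uses the average of the most recently seen gradient of every term, which is what yields the faster rates.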
Deep belief networks (DBN) are generative neural network models with many layers of hidden explanatory factors, recently introduced by Hinton, Osindero, and Teh (2006) along with a greedy layer-wise unsupervised learning algorithm. The building block of a DBN is a probabilistic model called a restricted Boltzmann machine (RBM), used to represent one layer of the model. Restricted Boltzmann machines are interesting because inference is easy in them and because they have been successfully used as building blocks for training deeper models. We first prove that adding hidden units yields strictly improved modeling power, while a second theorem shows that RBMs are universal approximators of discrete distributions. We then study the question of whether DBNs with more layers are strictly more powerful in terms of representational power. This suggests a new and less greedy criterion for training RBMs within DBNs.
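The abstract's claim that "inference is easy" in an RBM refers to its factorial conditional distributions. A minimal binary-RBM sketch below makes this concrete, together with one step of contrastive divergence (CD-1), a standard approximate training rule; the class layout and parameter names are illustrative assumptions, not code from the paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class RBM:
    """Minimal binary RBM sketch with energy
    E(v, h) = -v @ W @ h - b @ v - c @ h.
    Illustrative only; names are assumptions."""
    def __init__(self, n_visible, n_hidden, seed=0):
        rng = np.random.default_rng(seed)
        self.W = 0.01 * rng.standard_normal((n_visible, n_hidden))
        self.b = np.zeros(n_visible)   # visible biases
        self.c = np.zeros(n_hidden)    # hidden biases
        self.rng = rng

    def p_h_given_v(self, v):
        # Conditionals factorize over units: this is the "easy inference".
        return sigmoid(v @ self.W + self.c)

    def p_v_given_h(self, h):
        return sigmoid(h @ self.W.T + self.b)

    def cd1_step(self, v0, lr=0.05):
        """One CD-1 update from a single visible vector v0."""
        ph0 = self.p_h_given_v(v0)
        h0 = (self.rng.random(ph0.shape) < ph0).astype(float)  # sample hiddens
        v1 = self.p_v_given_h(h0)            # mean-field reconstruction
        ph1 = self.p_h_given_v(v1)
        self.W += lr * (np.outer(v0, ph0) - np.outer(v1, ph1))
        self.b += lr * (v0 - v1)
        self.c += lr * (ph0 - ph1)
```

In the greedy layer-wise DBN scheme the abstract mentions, one RBM is trained this way, then its hidden activations become the "visible" data for the next layer's RBM.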
In this paper, we show a direct relation between spectral embedding methods and kernel PCA, and how both are special cases of a more general learning problem, that of learning the principal eigenfunctions of an operator defined from a kernel and the unknown data generating density.
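The operator-eigenfunction view described above can be sketched in a few lines: kernel PCA reduces to an eigendecomposition of the doubly-centered Gram matrix, and the estimated eigenfunctions extend to new points through a Nyström-style formula. The RBF kernel, its bandwidth, and all names below are illustrative assumptions, not the paper's notation.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=0.5):
    """Gaussian (RBF) kernel matrix; an illustrative kernel choice."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def kernel_pca_eigenfunctions(X, n_components=2, gamma=0.5):
    """Kernel PCA via the centered Gram matrix. Returns a function that
    evaluates the estimated principal eigenfunctions at new points."""
    n = len(X)
    K = rbf_kernel(X, X, gamma)
    H = np.eye(n) - np.ones((n, n)) / n        # centering matrix
    Kc = H @ K @ H
    w, V = np.linalg.eigh(Kc)
    idx = np.argsort(w)[::-1][:n_components]   # keep the leading eigenpairs
    w, V = w[idx], V[:, idx]

    def eigenfunctions(Y):
        Ky = rbf_kernel(Y, X, gamma)
        # Center the new kernel rows consistently with the training Gram.
        Kyc = Ky - Ky.mean(1, keepdims=True) - K.mean(0) + K.mean()
        # Nystrom-style extension: f_k(y) = (1/w_k) * sum_i V[i,k] * Kc(y, x_i)
        return Kyc @ V / w

    return eigenfunctions, w / n               # empirical operator eigenvalues
```

Evaluated at the training points themselves, these eigenfunctions reduce to the Gram-matrix eigenvectors, which is the sense in which spectral embeddings and kernel PCA are the same computation seen through the operator.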
Key points
• Parvalbumin-expressing interneurons represent a major source of inhibition of CA1 hippocampal principal cells and influence both spike timing precision and network oscillations.
• These interneurons receive both feed-forward and feedback excitatory inputs which recruit them in the hippocampal network.
• In this study, we compared the functional properties of these two inputs and how they may be modified by neuronal activity.
• We show that calcium-permeable AMPA receptors and NMDA receptors are differentially distributed at feed-forward versus feedback inputs and act as coincidence detectors of opposing modalities.
• Our results reveal that the two major excitatory inputs onto CA1 parvalbumin-expressing interneurons undergo long term plasticity with different frequency regimes of afferent activity, which is likely to influence their function under both normal and pathological conditions.

Abstract
Hippocampal parvalbumin-expressing interneurons (PV INs) provide fast and reliable GABAergic signalling to principal cells and orchestrate hippocampal ensemble activities. Precise coordination of principal cell activity by PV INs relies in part on the efficacy of excitatory afferents that recruit them in the hippocampal network. Feed-forward (FF) inputs in particular from Schaffer collaterals influence spike timing precision in CA1 principal cells, whereas local feedback (FB) inputs may contribute to pacemaker activities. Although PV INs have been shown to undergo activity-dependent long term plasticity, how both inputs are modulated during principal cell firing is unknown. Here we show that FF and FB synapses onto PV INs are endowed with distinct postsynaptic glutamate receptors which set opposing long-term plasticity rules. Inward-rectifying AMPA receptors (AMPARs) expressed at both FF and FB inputs mediate a form of anti-Hebbian long term potentiation (LTP), relying on coincident membrane hyperpolarization and synaptic activation.
In contrast, FF inputs are largely devoid of NMDA receptors (NMDARs), which are more abundant at FB afferents and confer on them an additional form of LTP with Hebbian properties. Both forms of LTP are expressed with no apparent change in presynaptic function. The specific endowment of FF and FB inputs with distinct coincidence detectors allows them to be differentially tuned upon high frequency afferent activity. Thus, high frequency (>20 Hz) stimulation specifically potentiates FB, but not FF afferents. We propose that these differential,