Finding the optimal hyperparameters of a model can be cast as a bilevel optimization problem, typically solved using zero-order techniques. In this work we study first-order methods when the inner optimization problem is convex but non-smooth. We show that forward-mode differentiation of proximal gradient descent and of proximal coordinate descent yields sequences of Jacobians that converge toward the exact Jacobian. Using implicit differentiation, we show that the non-smoothness of the inner problem can be leveraged to speed up the computation. Finally, we provide a bound on the error made on the hypergradient when the inner optimization problem is solved approximately. Results on regression and classification problems reveal computational benefits for hyperparameter optimization, especially when multiple hyperparameters are required.
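To make the forward-mode idea concrete, here is a minimal sketch, not the authors' implementation. It assumes the inner problem is a Lasso with objective 1/(2n) ||y - X beta||^2 + lam ||beta||_1 and a single hyperparameter lam; the abstract does not fix a particular model. The function names (`soft_threshold`, `lasso_forward_diff`, `lasso_implicit_jacobian`) are hypothetical. The first routine differentiates each proximal gradient step with respect to lam; the second solves the implicit-differentiation linear system only on the support of the solution, which is one way sparsity can shrink the computation.

```python
import numpy as np


def soft_threshold(z, t):
    """Proximal operator of t * ||.||_1 (elementwise soft-thresholding)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def lasso_forward_diff(X, y, lam, n_iter=2000):
    """Proximal gradient descent on the Lasso, propagating d(beta)/d(lam).

    Assumed objective: 1 / (2 n) * ||y - X beta||^2 + lam * ||beta||_1
    """
    n, p = X.shape
    step = n / np.linalg.norm(X, ord=2) ** 2  # 1 / Lipschitz constant of the gradient
    beta = np.zeros(p)
    jac = np.zeros(p)  # Jacobian of beta with respect to the scalar lam
    for _ in range(n_iter):
        # Gradient step on the smooth part, and its derivative w.r.t. lam.
        z = beta - step * X.T @ (X @ beta - y) / n
        dz = jac - step * X.T @ (X @ jac) / n
        # Soft-thresholding has derivative 1 on active coordinates, 0 elsewhere;
        # the threshold step * lam itself depends on lam, hence the second term.
        active = np.abs(z) > step * lam
        beta = soft_threshold(z, step * lam)
        jac = active * dz - step * np.sign(z) * active
    return beta, jac


def lasso_implicit_jacobian(X, beta):
    """Implicit differentiation restricted to the support of beta."""
    n = X.shape[0]
    support = beta != 0
    X_s = X[:, support]
    jac = np.zeros_like(beta)
    # Optimality on the support: X_s^T (X_s beta_s - y) / n + lam * sign(beta_s) = 0.
    # Differentiating in lam gives X_s^T X_s J_s / n + sign(beta_s) = 0.
    jac[support] = -n * np.linalg.solve(X_s.T @ X_s, np.sign(beta[support]))
    return jac
```

Under these assumptions, once the support has stabilized the forward-mode Jacobian should approach the implicit one, while the implicit system has only as many unknowns as there are non-zero coefficients.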
This document presents the technical layout and the performance of the CLAS12 Forward Tagger (FT). The FT, composed of an electromagnetic calorimeter based on PbWO₄ crystals (FT-Cal), a scintillation hodoscope (FT-Hodo), and several layers of Micromegas trackers (FT-Trk), has been designed to detect electrons and photons scattered at polar angles from 2° to 5° and to meet the physics goals of the hadron spectroscopy program and other experiments running with the CLAS12 spectrometer in Hall B.