Analogies to stochastic optimization are common in developmental psychology, describing a gradual reduction in randomness over the lifespan. Yet for lack of concrete empirical comparison, it remains ambiguous how this analogy should be interpreted. Using data from n=281 participants ages 5 to 55, we show that "cooling off" does not apply only to the single dimension of randomness. Rather, development resembles a stochastic optimization process in the space of learning strategies, which we characterize along the key dimensions of reward generalization, uncertainty-directed exploration, and random temperature. What begins in childhood as large tweaks to the parameters that define learning plateaus and converges in adulthood. The developmental trajectory of human parameters is strikingly similar to that of several stochastic optimization algorithms, yet we begin to observe a divergence around adolescence. Remarkably, none of the optimization algorithms discovered reliably better regions of the strategy space than adult participants, suggesting that human development is highly efficient.