Neural Networks for Signal Processing II Proceedings of the 1992 IEEE Workshop
DOI: 10.1109/nnsp.1992.253713

Learning rate schedules for faster stochastic gradient search

Cited by 192 publications (179 citation statements)
References 8 publications
“…Although gradient descent is apparently incompatible with our data, our results do not rule out a more complex learning process in which gradient descent is augmented with a simulated annealing process (Darken & Moody, 1992; Geman & Geman, 1984). Simulated annealing, which has its roots in statistical thermodynamics, would have the effect of shaking the dynamical trajectories as gradient descent forces them downhill.…”
Section: Discussion (contrasting)
confidence: 92%
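The augmentation this passage describes, ordinary gradient descent whose trajectory is "shaken" by annealing-style noise that is gradually cooled, can be illustrated with a minimal sketch. Everything below (the function name, the logarithmic cooling schedule, the parameter values) is an illustrative assumption, not code from the cited papers.

import numpy as np

def sgd_with_annealed_noise(grad_fn, w0, n_steps=1000, lr=0.01, temp0=1.0):
    # Minimal sketch: each update takes the usual downhill gradient step and
    # adds Gaussian noise whose variance ("temperature") decays over time,
    # so early iterations are shaken strongly and late ones barely at all.
    w = np.asarray(w0, dtype=float)
    for t in range(1, n_steps + 1):
        temp = temp0 / np.log(t + 1.0)            # slowly cooling temperature (assumed schedule)
        noise = np.random.normal(0.0, np.sqrt(temp), size=w.shape)
        w = w - lr * grad_fn(w) + lr * noise      # downhill step plus thermal "shake"
    return w

For example, with grad_fn = lambda w: 2.0 * w (the gradient of a simple quadratic), the iterate still settles near zero while the early, hotter noise helps it escape shallow basins.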
“…A commonly used stepsize with such properties is Darken and Moody's (1992) Search-Then-Converge (STC) routine, which is updated as follows:…”
Section: Modified Search-Then-Converge (STC) Stepsize (mentioning)
confidence: 99%
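The STC routine referenced here keeps the stepsize roughly constant during an initial "search" phase and then lets it decay like 1/t during a "converge" phase. A commonly quoted simplified form of the Darken and Moody schedule is eta_t = eta_0 / (1 + t/tau); the sketch below uses that simplified form, and the values of eta_0 and tau are illustrative rather than taken from the citing paper.

def stc_learning_rate(t, eta0=0.1, tau=100.0):
    # Simplified search-then-converge schedule: approximately eta0 while
    # t << tau (search phase), decaying roughly like eta0 * tau / t once
    # t >> tau (converge phase). This is a common simplification, not the
    # full expression from Darken and Moody (1992).
    return eta0 / (1.0 + t / tau)

Used inside an SGD loop as w -= stc_learning_rate(t) * grad_fn(w), this stepsize satisfies the usual Robbins-Monro conditions: its sum over t diverges while the sum of its squares converges.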
“…The first modification adds a Search-Then-Converge (STC) subroutine (Darken and Moody, 1992) to update the algorithm's stepsize. However, instead of using the iteration count to update the stepsize, as in the original application, I use the ratio of the number of times a state has been visited on the simulation path to the iteration count.…”
mentioning
confidence: 99%
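The modification quoted above drives the schedule with per-state visit statistics rather than the global iteration count alone. One possible reading is sketched below; the class name, the exact way the visit ratio enters the schedule, and the parameter values are assumptions for illustration, not the cited author's implementation.

from collections import defaultdict

class VisitRatioSTC:
    # Per-state search-then-converge stepsize driven by how often each state
    # has been visited relative to the total iteration count (one plausible
    # reading of the quoted modification; the functional form is illustrative).
    def __init__(self, eta0=0.1, tau=100.0):
        self.eta0, self.tau = eta0, tau
        self.visits = defaultdict(int)
        self.t = 0

    def stepsize(self, state):
        self.t += 1
        self.visits[state] += 1
        ratio = self.visits[state] / self.t        # visit ratio mentioned in the quote
        # ratio * t equals the state's own visit count, so rarely visited
        # states keep a large stepsize while frequently visited ones anneal.
        return self.eta0 / (1.0 + (ratio * self.t) / self.tau)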
“…If the number of data points is greater than the number of clusters, then for each data point we calculate the distance to every centroid and take the minimum. The data point is said to belong to the cluster whose centroid is at that minimum distance [6].…”
Section: Working of K-means Clustering Algorithm (mentioning)
confidence: 99%
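The assignment step described in this passage, each point going to the cluster with the nearest centroid, can be written compactly. The function and variable names below are illustrative.

import numpy as np

def assign_to_clusters(data, centroids):
    # For every data point, compute the Euclidean distance to each centroid
    # and assign the point to the cluster with the minimum distance.
    labels = []
    for x in data:
        dists = np.linalg.norm(centroids - x, axis=1)  # distance to every centroid
        labels.append(int(np.argmin(dists)))           # index of the nearest centroid
    return np.array(labels)

For example, with data = np.array([[0.0, 0.0], [5.0, 5.0]]) and centroids = np.array([[0.5, 0.5], [4.5, 4.5]]), the function returns array([0, 1]).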