Neural Networks for Signal Processing II Proceedings of the 1992 IEEE Workshop
DOI: 10.1109/nnsp.1992.253713

Learning rate schedules for faster stochastic gradient search

Cited by 192 publications (179 citation statements)
References 8 publications
“…Although gradient descent is apparently incompatible with our data, our results do not rule out a more complex learning process in which gradient descent is augmented with a simulated annealing process (Darken & Moody, 1992; Geman & Geman, 1984). Simulated annealing, which has its roots in statistical thermodynamics, would have the effect of shaking the dynamical trajectories as gradient descent forces them downhill.…”
Section: Discussion (contrasting)
confidence: 92%
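The augmentation this passage describes, ordinary gradient descent whose trajectory is "shaken" by annealing-style noise that is gradually cooled, can be illustrated with a minimal sketch. Everything below (the function name, the logarithmic cooling schedule, the parameter values) is an illustrative assumption, not code from the cited papers.

import numpy as np

def sgd_with_annealed_noise(grad_fn, w0, n_steps=1000, lr=0.01, temp0=1.0):
    # Minimal sketch: each update takes the usual downhill gradient step and
    # adds Gaussian noise whose variance ("temperature") decays over time,
    # so early iterations are shaken strongly and late ones barely at all.
    w = np.asarray(w0, dtype=float)
    for t in range(1, n_steps + 1):
        temp = temp0 / np.log(t + 1.0)            # slowly cooling temperature (assumed schedule)
        noise = np.random.normal(0.0, np.sqrt(temp), size=w.shape)
        w = w - lr * grad_fn(w) + lr * noise      # downhill step plus thermal "shake"
    return w

For example, with grad_fn = lambda w: 2.0 * w (the gradient of a simple quadratic), the iterate still settles near zero while the early, hotter noise helps it escape shallow basins.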
“…A commonly used stepsize with such properties is Darken and Moody's (1992) Search-Then-Converge (STC) routine, which is updated as follows:…”
Section: Modified Search-Then-Converge (STC) Stepsize (mentioning)
confidence: 99%
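The STC routine referenced here keeps the stepsize roughly constant during an initial "search" phase and then lets it decay like 1/t during a "converge" phase. A commonly quoted simplified form of the Darken and Moody schedule is eta_t = eta_0 / (1 + t/tau); the sketch below uses that simplified form, and the values of eta_0 and tau are illustrative rather than taken from the citing paper.

def stc_learning_rate(t, eta0=0.1, tau=100.0):
    # Simplified search-then-converge schedule: approximately eta0 while
    # t << tau (search phase), decaying roughly like eta0 * tau / t once
    # t >> tau (converge phase). This is a common simplification, not the
    # full expression from Darken and Moody (1992).
    return eta0 / (1.0 + t / tau)

Used inside an SGD loop as w -= stc_learning_rate(t) * grad_fn(w), this stepsize satisfies the usual Robbins-Monro conditions: its sum over t diverges while the sum of its squares converges.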
“…The first modification adds a Search-Then-Converge (STC) subroutine (Darken and Moody, 1992) to update the algorithm's stepsize. However, instead of using the iteration count to update the stepsize, as in the original application, I use the ratio of the number of times a state has been visited on the simulation path to the iteration count.…”
mentioning
confidence: 99%
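The modification quoted above drives the schedule with per-state visit statistics rather than the global iteration count alone. One possible reading is sketched below; the class name, the exact way the visit ratio enters the schedule, and the parameter values are assumptions for illustration, not the cited author's implementation.

from collections import defaultdict

class VisitRatioSTC:
    # Per-state search-then-converge stepsize driven by how often each state
    # has been visited relative to the total iteration count (one plausible
    # reading of the quoted modification; the functional form is illustrative).
    def __init__(self, eta0=0.1, tau=100.0):
        self.eta0, self.tau = eta0, tau
        self.visits = defaultdict(int)
        self.t = 0

    def stepsize(self, state):
        self.t += 1
        self.visits[state] += 1
        ratio = self.visits[state] / self.t        # visit ratio mentioned in the quote
        # ratio * t equals the state's own visit count, so rarely visited
        # states keep a large stepsize while frequently visited ones anneal.
        return self.eta0 / (1.0 + (ratio * self.t) / self.tau)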
“…If the number of data points is greater than the number of clusters, then for each data point we calculate the distance to every centroid and take the minimum. The data point is said to belong to the cluster whose centroid is at that minimum distance [6].…”
Section: Working of K-means Clustering Algorithm (mentioning)
confidence: 99%
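The assignment step described in this passage, each point going to the cluster with the nearest centroid, can be written compactly. The function and variable names below are illustrative.

import numpy as np

def assign_to_clusters(data, centroids):
    # For every data point, compute the Euclidean distance to each centroid
    # and assign the point to the cluster with the minimum distance.
    labels = []
    for x in data:
        dists = np.linalg.norm(centroids - x, axis=1)  # distance to every centroid
        labels.append(int(np.argmin(dists)))           # index of the nearest centroid
    return np.array(labels)

For example, with data = np.array([[0.0, 0.0], [5.0, 5.0]]) and centroids = np.array([[0.5, 0.5], [4.5, 4.5]]), the function returns array([0, 1]).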