2023
DOI: 10.1016/j.automatica.2023.111183
Training recurrent neural networks by sequential least squares and the alternating direction method of multipliers

Cited by 3 publications (1 citation statement)
References 28 publications
“…The second-order methods, however, can be used to mitigate these drawbacks by utilizing, in their learning procedures, the information contained in the curvature of the loss surface, which provably accelerates convergence. The second-order methods for learning RNNs have historically been used in two ways: 1) the use of second-order optimization algorithms, such as the generalized Gauss-Newton (GGN), Levenberg-Marquardt (LM), and conjugate gradient (CG) algorithms (see [12], [13], [14], [15], [16], [17], [18], [19]), and 2) the use of nonlinear sequential state-estimation techniques, such as the extended Kalman filter (EKF) method (see [20], [21], [22], [23], [24], [25]). In the first approach, the second-order information is captured through Hessian (or approximate Hessian) computations, while in the second approach, the second-order information is computed recursively as a prediction-error covariance matrix.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
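To make the second approach in the quoted passage concrete, below is a minimal numpy sketch of EKF-based RNN training: the network weights are treated as the state of a random-walk model, and the prediction-error covariance matrix P carries the second-order information recursively, exactly the role the passage attributes to it. This is an illustration of the generic EKF training scheme (in the spirit of refs. [20]-[25] of the quote), not the sequential least-squares/ADMM algorithm of the cited Automatica paper. The toy network, function names, dimensions, and noise parameters are all hypothetical, and the Jacobian is the instantaneous one (it ignores the recurrent dependence of the hidden state on the weights, a common truncation).

```python
# Hypothetical sketch: EKF weight estimation for a toy Elman RNN.
# Weights follow a random walk; P is the prediction-error covariance.
import numpy as np

def rnn_step(w, h, x, nh):
    """One step of a toy Elman RNN; w packs (Whh, Wxh, Wo) flattened."""
    Whh = w[:nh * nh].reshape(nh, nh)
    Wxh = w[nh * nh:nh * nh + nh].reshape(nh, 1)
    Wo = w[nh * nh + nh:].reshape(1, nh)
    h_new = np.tanh(Whh @ h + Wxh * x)
    y = (Wo @ h_new).item()
    return h_new, y

def num_jacobian(w, h, x, nh, eps=1e-6):
    """Forward-difference Jacobian (1 x nw) of the scalar output w.r.t. w,
    with the previous hidden state h held fixed (truncated Jacobian)."""
    J = np.zeros((1, w.size))
    _, y0 = rnn_step(w, h, x, nh)
    for i in range(w.size):
        wp = w.copy()
        wp[i] += eps
        _, yi = rnn_step(wp, h, x, nh)
        J[0, i] = (yi - y0) / eps
    return J

def ekf_train(xs, ys, nh=4, q=1e-5, r=1e-2, p0=1.0, seed=0):
    """Sequentially estimate the RNN weights with an EKF."""
    rng = np.random.default_rng(seed)
    nw = nh * nh + nh + nh              # sizes of Whh + Wxh + Wo
    w = 0.1 * rng.standard_normal(nw)   # weight vector = EKF state
    P = p0 * np.eye(nw)                 # prediction-error covariance
    h = np.zeros((nh, 1))
    for x, y in zip(xs, ys):
        H = num_jacobian(w, h, x, nh)          # measurement Jacobian
        h_new, y_hat = rnn_step(w, h, x, nh)   # predicted output
        S = (H @ P @ H.T).item() + r           # innovation variance
        K = (P @ H.T) / S                      # Kalman gain (nw x 1)
        w = w + (K * (y - y_hat)).ravel()      # second-order weight update
        P = P - K @ H @ P + q * np.eye(nw)     # covariance recursion
        P = 0.5 * (P + P.T)                    # keep P symmetric
        h = h_new
    return w

# Toy usage: fit a one-step-ahead predictor of a noisy sine wave.
t = np.linspace(0, 8 * np.pi, 400)
sig = np.sin(t) + 0.05 * np.random.default_rng(1).standard_normal(t.size)
w_hat = ekf_train(sig[:-1], sig[1:])
```

Note how the gain K = P Hᵀ / S plays the role of a curvature-preconditioned step: whereas first-order SGD would update w with a scalar learning rate times Hᵀ(y - ŷ), the EKF scales the innovation by the recursively maintained covariance P, which is the sense in which the passage calls this a second-order method.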