2023
DOI: 10.1109/tac.2022.3222750

Recurrent Neural Network Training With Convex Loss and Regularization Functions by Extended Kalman Filtering

Cited by 8 publications (5 citation statements)
References 30 publications
“…Thus, the performance of a standard GRU model and that of a physics-informed GRU (PI-GRU) model are now compared. These are trained on the 15 690-sample dataset over 1500 epochs with the ADAM optimizer and a learning rate of 0.003, so as to obtain a good trade-off between convergence speed and the avoidance of excessive oscillations [6].…”
Section: A. Identification Results
confidence: 99%
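The training setup quoted above (ADAM optimizer, learning rate 0.003, 1500 epochs) corresponds to a standard deep-learning training loop. Below is a minimal PyTorch sketch of such a setup; the dataset tensors, input/output dimensions, and hidden size are illustrative placeholders, not values taken from the cited work.

```python
# Minimal sketch of a GRU training loop with the quoted hyperparameters
# (ADAM, lr = 0.003, 1500 epochs). Shapes and hidden size are illustrative.
import torch
import torch.nn as nn

class GRUModel(nn.Module):
    def __init__(self, n_inputs, n_hidden, n_outputs):
        super().__init__()
        self.gru = nn.GRU(n_inputs, n_hidden, batch_first=True)
        self.head = nn.Linear(n_hidden, n_outputs)

    def forward(self, u):
        h, _ = self.gru(u)           # h: (batch, time, n_hidden)
        return self.head(h)          # one output per time step

# Placeholder data: 15 690 samples arranged as one long sequence
u = torch.randn(1, 15_690, 3)        # inputs  (batch, time, features)
y = torch.randn(1, 15_690, 1)        # targets

model = GRUModel(n_inputs=3, n_hidden=32, n_outputs=1)
optimizer = torch.optim.Adam(model.parameters(), lr=0.003)
loss_fn = nn.MSELoss()

for epoch in range(1500):
    optimizer.zero_grad()
    loss = loss_fn(model(u), y)
    loss.backward()
    optimizer.step()
```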
“…Please note that $v = v^{[1]} = v^{[2]} = T_{s_0}$, since the supply temperatures at nodes $\alpha_1$ and $\alpha_2$ are directly affected by the one at the heating-station node $\alpha_0$, which is the overall system input. Moreover, $v^{[3]} = y_s^{[1]}$, since $\alpha_1$ is the only preceding significant node for $\alpha_3$, and $v^{[4]} = v^{[5]} = [y_s^{[2]\prime}, y_s^{[3]\prime}]'$, since $\alpha_2$ and $\alpha_3$ are the preceding significant nodes for $\alpha_4$ and $\alpha_5$, whereas $u^{[6]} = [y_r^{[1]\prime}, y_r^{[2]\prime}, y_r^{[3]\prime}, y_r^{[4]\prime}, y_r^{[5]\prime}]'$, since the return-associated RNN is fed with the load output temperatures and water flows, which are outputs of the five load-associated RNNs.…”
Section: B. Physics-Informed Recurrent Neural Network
confidence: 99%
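The quoted passage describes how each node-associated sub-model is fed with the outputs of the nodes preceding it in the network graph. A rough NumPy sketch of that wiring, under the assumption that each node-associated RNN can be treated as a function of the stacked outputs of its predecessors, is given below; the stand-in `node_model` and all dimensions are invented for illustration only.

```python
# Illustrative wiring of the node-associated sub-models as described in the
# quote: nodes 1 and 2 receive the heating-station supply temperature, node 3
# the output of node 1, nodes 4 and 5 the stacked outputs of nodes 2 and 3,
# and the return-associated model the stacked outputs of all five load nodes.
import numpy as np

def node_model(v):
    """Stand-in for a node-associated RNN; returns dummy supply-side and
    return-side outputs (y_s, y_r)."""
    y_s = np.tanh(v).mean(keepdims=True)
    y_r = np.cos(v).mean(keepdims=True)
    return y_s, y_r

T_s0 = np.array([80.0])                    # supply temperature at node alpha_0

v1 = v2 = T_s0                             # v[1] = v[2] = T_s0
ys1, yr1 = node_model(v1)
ys2, yr2 = node_model(v2)

v3 = ys1                                   # v[3] = y_s[1]
ys3, yr3 = node_model(v3)

v4 = v5 = np.concatenate([ys2, ys3])       # v[4] = v[5] = [y_s[2]', y_s[3]']'
ys4, yr4 = node_model(v4)
ys5, yr5 = node_model(v5)

u6 = np.concatenate([yr1, yr2, yr3, yr4, yr5])   # input of the return-associated RNN
```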
“…Then, each step $k \in \mathcal{K}$ causes a sufficient decrease in $M_1$ through the backtracking line-search procedure. Moreover, with $M_1(z_k; \rho) \geq K > -\infty$ for some constant $K$, the sufficient decrease condition (25) shows that…”
Section: Globalization of ISQPRL by a Line Search
confidence: 99%
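The quoted argument relies on a backtracking line search that enforces a sufficient (Armijo-type) decrease of the merit function $M_1$ at every step. The generic sketch below illustrates such a procedure; the merit function, its gradient, and the constants are placeholders, not the ones from the cited paper.

```python
# Generic backtracking (Armijo) line search: shrink the step length until the
# merit function decreases by at least c1 * alpha * (directional derivative).
import numpy as np

def backtracking_line_search(merit, grad_merit, z, d, alpha0=1.0, c1=1e-4, tau=0.5):
    """Return a step length alpha giving sufficient decrease of the merit function."""
    alpha = alpha0
    m0 = merit(z)
    slope = grad_merit(z) @ d              # negative for a descent direction
    for _ in range(50):                    # cap the number of backtracking steps
        if merit(z + alpha * d) <= m0 + c1 * alpha * slope:
            break
        alpha *= tau
    return alpha

# Toy usage on a quadratic merit function
merit = lambda z: 0.5 * z @ z
grad_merit = lambda z: z
z = np.array([2.0, -1.0])
d = -grad_merit(z)                         # steepest-descent direction
print(backtracking_line_search(merit, grad_merit, z, d))
```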
“…The second-order methods, however, can be used to mitigate these drawbacks by utilizing, in their learning procedures, the information contained in the curvature of the loss surface, which provably accelerates convergence. Second-order methods for learning RNNs have historically been used in two ways: 1) via second-order optimization algorithms, such as generalized Gauss-Newton (GGN), Levenberg-Marquardt (LM), and conjugate gradient (CG) algorithms (see [12], [13], [14], [15], [16], [17], [18], [19]), and 2) via nonlinear sequential state-estimation techniques, such as the extended Kalman filter (EKF) method (see [20], [21], [22], [23], [24], [25]). In the first approach, the second-order information is captured through Hessian (or approximate Hessian) computations, while in the second approach, the second-order information is computed recursively as a prediction-error covariance matrix.…”
Section: Introduction
confidence: 99%
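In the EKF-based approach mentioned in the quote, the network weights are treated as the state of a nonlinear system and the second-order information enters through the prediction-error covariance. The sketch below shows a bare-bones EKF weight update for a generic parameterized model; the model, Jacobian, and noise covariances are illustrative choices, not those of the cited works.

```python
# Bare-bones EKF update for model parameters w: the covariance P carries the
# recursively computed second-order information mentioned in the quote.
import numpy as np

def ekf_step(w, P, x, y, model, jacobian, Q, R):
    """One EKF parameter update given an input/target pair (x, y)."""
    y_hat = model(w, x)                    # predicted output
    H = jacobian(w, x)                     # d y_hat / d w, shape (n_y, n_w)
    P_pred = P + Q                         # time update (random-walk weights)
    S = H @ P_pred @ H.T + R               # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)    # Kalman gain
    w = w + K @ (y - y_hat)                # measurement update of the weights
    P = P_pred - K @ H @ P_pred            # covariance update
    return w, P

# Toy usage: fit a scalar affine model y = w[0] * x + w[1]
model = lambda w, x: np.array([w[0] * x + w[1]])
jacobian = lambda w, x: np.array([[x, 1.0]])
w, P = np.zeros(2), np.eye(2)
Q, R = 1e-6 * np.eye(2), 1e-2 * np.eye(1)
for x, y in [(1.0, 2.1), (2.0, 4.0), (3.0, 5.9)]:
    w, P = ekf_step(w, P, x, np.array([y]), model, jacobian, Q, R)
```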