Published: 2017
DOI: 10.2316/journal.206.2017.5.206-5112

Model-Free Multi-Kernel Learning Control for Nonlinear Discrete-Time Systems

Abstract: Reinforcement learning (RL) has become an important research topic for solving learning control problems of nonlinear dynamic systems. In RL, feature representation is a critical factor in improving the performance of online or offline learning controllers. Although multi-kernel learning has been studied in supervised learning problems, there is little work on multi-kernel-based feature representation in RL algorithms. In this paper, a model-free multi-kernel learning control (MMLC) approach is proposed for a cla…

Cited by 3 publications (3 citation statements)
References 17 publications
“…Following the same line with [18], setting $\partial J(s)/\partial u(s) = 0$, it is possible to compute the optimal control policy at time $k$ as a function of the next state $s^{+}$:
$$u(s) = -\frac{1}{2}\gamma R^{-1}\left(\frac{\partial s^{+}}{\partial u(s)}\right)^{T}\lambda(s^{+})$$
where $\lambda = \partial J/\partial s$ is the costate that can be obtained as follows:
$$\lambda(s) = \frac{\partial\left[r(s,u(s)) + \gamma J(s^{+})\right]}{\partial s} = 2Qs + \gamma\left(\frac{\partial s^{+}}{\partial s}\right)^{T}\lambda(s^{+})$$
As the system given by (1) is non-linear, it is difficult to analytically calculate $\lambda(s^{+})$, which needs to be used in (4). To solve this problem, the kernel-based least-squares iterative method has been widely used for policy and value evaluation in the framework of DHP.…”
Section: Problem Formulation (mentioning)
confidence: 99%
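
To make the kernel-based least-squares evaluation mentioned in this statement concrete, here is a minimal illustrative sketch in Python: the costate λ(s) is represented as a kernel expansion over sampled states and regressed onto the DHP targets 2Qs + γ(∂s⁺/∂s)ᵀλ(s⁺). The Gaussian kernel, the width, and the ridge regularization are assumptions for illustration, not the implementation of the cited paper.

import numpy as np

def gaussian_kernel(X, Y, width=1.0):
    # Gram matrix K[i, j] = exp(-||X_i - Y_j||^2 / (2 * width^2)).
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * width ** 2))

def fit_costate(S, lam_targets, width=1.0, reg=1e-6):
    # Least-squares fit of lambda(s) ~ K(s, S) @ W, where
    #   S           : (n, dim_s) array of sampled states
    #   lam_targets : (n, dim_s) DHP targets 2*Q*s + gamma*(ds+/ds)^T lambda(s+),
    #                 evaluated with the previous costate estimate (an assumption
    #                 of this sketch; the cited paper's update rule may differ).
    K = gaussian_kernel(S, S, width)
    # Ridge-regularized normal equations: (K + reg*I) W = lam_targets.
    W = np.linalg.solve(K + reg * np.eye(len(S)), lam_targets)
    return lambda s: gaussian_kernel(np.atleast_2d(s), S, width) @ W

In the iterative scheme, the targets are recomputed with the latest costate estimate and the fit is repeated until the expansion weights stop changing.
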
“…Compared with single-kernel designs, the adopted multi-kernel structure consists of a linear combination of weighted single-kernel functions; hence it is capable of reducing the complexity of kernel-width parameter tuning for feature representation, especially for high-dimensional and heterogeneous data samples. Furthermore, the multi-kernel-based feature representation is used for both the optimal policy and the value function approximation (VFA), which differs from the work in [18].…”
Section: Introduction (mentioning)
confidence: 99%
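
The multi-kernel structure described in this statement is, in essence, a weighted linear combination of single-kernel functions. A minimal sketch follows, assuming Gaussian kernels; the widths and weights are hypothetical placeholders, and in a multi-kernel learning setting the weights would typically be learned rather than fixed.

import numpy as np

def multi_kernel(x, y, widths=(0.5, 1.0, 2.0), weights=(0.3, 0.4, 0.3)):
    # k(x, y) = sum_i weights[i] * exp(-||x - y||^2 / (2 * widths[i]^2))
    d2 = np.sum((np.asarray(x, dtype=float) - np.asarray(y, dtype=float)) ** 2)
    return sum(w * np.exp(-d2 / (2.0 * s ** 2))
               for w, s in zip(weights, widths))

Because every candidate width contributes through its own kernel, tuning reduces to choosing the combination weights rather than searching for a single good width, which is the kind of complexity reduction the statement refers to.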