2022
DOI: 10.48550/arxiv.2201.08536
Preprint
Instance-Dependent Confidence and Early Stopping for Reinforcement Learning

Abstract: Various algorithms for reinforcement learning (RL) exhibit dramatic variation in their convergence rates as a function of problem structure. Such problem-dependent behavior is not captured by worst-case analyses and has accordingly inspired a growing effort in obtaining instance-dependent guarantees and deriving instance-optimal algorithms for RL problems. This research has been carried out, however, primarily within the confines of theory, providing guarantees that explain ex post the performance differences …

Cited by 1 publication (1 citation statement)
References 4 publications
“…The feedback obtained from the proposed framework can be further exploited for example for safe early stopping [9]. Moreover, by choosing a set of high-level operations from hyperparameter tuning to algorithm selection set, to guide an agent to perform various tasks, like remembering history, comparing and contrasting current and past inputs, and using learning methods to change its own learning methods, the proposed approach can be considered a first step towards Meta-Learning [27].…”
Section: Concluding Remarks and Future Work
Mentioning confidence: 99%