2022
DOI: 10.1609/aaai.v36i7.20730
Adapt to Environment Sudden Changes by Learning a Context Sensitive Policy

Abstract: Dealing with real-world reinforcement learning (RL) tasks, we shall be aware that the environment may have sudden changes. We expect that a robust policy is able to handle such changes and adapt to the new environment rapidly. Context-based meta reinforcement learning aims at learning environment-adaptable policies. These methods adopt a context encoder to perceive the environment on-the-fly, following which a contextual policy makes environment-adaptive decisions according to the context. However, previous…
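The mechanism the abstract describes — a context encoder that summarizes recent experience, feeding a policy conditioned on that summary — can be sketched minimally as follows. This is an illustrative stand-in, not the paper's actual architecture: `encode_context`, `contextual_policy`, and the mean-pooling aggregation are hypothetical placeholders for the learned components.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode_context(transitions):
    """Mean-pool recent (s, a, r, s') transitions into a fixed-size context
    vector -- a stand-in for a learned recurrent or attention-based encoder."""
    feats = np.stack([np.concatenate([s, a, [r], s_next])
                      for s, a, r, s_next in transitions])
    return feats.mean(axis=0)

def contextual_policy(state, context, W):
    """Linear contextual policy: the action depends on [state; context],
    so a shift in the inferred context changes behavior immediately."""
    x = np.concatenate([state, context])
    return np.tanh(W @ x)

state_dim, action_dim = 3, 2
ctx_dim = state_dim + action_dim + 1 + state_dim  # concat of s, a, r, s'
W = rng.normal(size=(action_dim, state_dim + ctx_dim))

# Simulate a few transitions from the current (possibly changed) environment.
transitions = []
s = rng.normal(size=state_dim)
for _ in range(5):
    a = rng.normal(size=action_dim)
    r = float(rng.normal())
    s_next = rng.normal(size=state_dim)
    transitions.append((s, a, r, s_next))
    s = s_next

z = encode_context(transitions)       # perceive the environment on-the-fly
action = contextual_policy(s, z, W)   # context-adaptive decision
print(action.shape)
```

When the environment changes suddenly, fresh transitions shift the context vector `z`, and the contextual policy adapts without retraining its weights — the adaptation burden falls on the encoder.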

Cited by 13 publications (9 citation statements)
References 11 publications (15 reference statements)
“…Another one of our goals is to find an algorithm that can perform well in all the environments with little modification, which can be useful if a plant decides to change a reactor tank, or we want to adopt a trained algorithm for a new plant. We would like to develop existing meta-learning algorithms like [49,50].…”
Section: Discussion
confidence: 99%
“…The adsorption kinetic model shown in Equations (50), (51) and (52) can also be used to describe the CEX and AEX chromatography process. The same rule applies to the boundary conditions.…”
Section: A3143 CEX and AEX Chromatography
confidence: 99%
“…Moreover, we have noticed that the generalizable reward function can lead to a better policy in the target task, which introduces an alternative way for transfer reinforcement learning other than previous policy-based transfer methods (e.g. [38,39]). We will also explore reward-based transfer reinforcement learning methods.…”
Section: Discussion
confidence: 99%
“…Meta reinforcement learning [Duan et al., 2016, Houthooft et al., 2018] studies the methodologies that enable the agent to generalize across different tasks with few-shot samples in the target tasks. In this process, we have a set of tasks for policy training, but the deployed tasks are unknown, can be OOD compared with the distribution of the training tasks [Lee et al., 2020a], and even can be varied when deployed [Luo et al., 2022]. Tasks have different definitions in different scenarios, e.g., differences in reward functions [Finn et al., 2017a, Rothfuss et al., 2019], or parameters of dynamics [Peng et al., 2018, Zhang et al., 2018a].…”
Section: Meta RL
confidence: 99%