Constraint Sampling Reinforcement Learning: Incorporating Expertise for Faster Learning

Mu, Tong; Theocharous, Georgios; Arbour, David; Brunskill, Emma

doi:10.1609/aaai.v36i7.20753

Cited by 4 publications

(2 citation statements)

References 25 publications

“…Other work has investigated deep RL techniques with online RL techniques. Mu et al (2022) used deep knowledge tracing to generate synthetic data for online deep RL-based pedagogical planning. Similar approaches have been taken by others (Bassen et al 2020;Zhang et al 2022).…”

Section: Related Work (mentioning)

confidence: 99%

Online Reinforcement Learning-Based Pedagogical Planning for Narrative-Centered Learning Environments

Fahid, Rowe, Kim et al. 2024

AAAI

Pedagogical planners can provide adaptive support to students in narrative-centered learning environments by dynamically scaffolding student learning and tailoring problem scenarios. Reinforcement learning (RL) is frequently used for pedagogical planning in narrative-centered learning environments. However, RL-based pedagogical planning raises significant challenges due to the scarcity of data for training RL policies. Most prior work has relied on limited-size datasets and offline RL techniques for policy learning. Unfortunately, offline RL techniques do not support on-demand exploration and evaluation, which can adversely impact the quality of induced policies. To address the limitation of data scarcity and offline RL, we propose INSIGHT, an online RL framework for training data-driven pedagogical policies that optimize student learning in narrative-centered learning environments. The INSIGHT framework consists of three components: a narrative-centered learning environment simulator, a simulated student agent, and an RL-based pedagogical planner agent, which uses a reward metric that is associated with effective student learning processes. The framework enables the generation of synthetic data for on-demand exploration and evaluation of RL-based pedagogical planning. We have implemented INSIGHT with OpenAI Gym for a narrative-centered learning environment testbed with rule-based simulated student agents and a deep Q-learning-based pedagogical planner. Our results show that online deep RL algorithms can induce near-optimal pedagogical policies in the INSIGHT framework, while offline deep RL algorithms only find suboptimal policies even with large amounts of data.

“…Expertise Incorporated Learning: In order to advance the use of domain adaptation for wireless networks, we consider constraint sampling reinforcement learning (CSRL) [98] as a promising method to quantify the reality gap using domain expertise. In this way, expert knowledge of the networking environment can be integrated during the training process via sensing [51], [55] to enable effective policy transfer in unknown environments with minimal human interaction.…”

Section: B Research Opportunities (mentioning)

confidence: 99%

Digital Twin-Enabled Domain Adaptation for Zero-Touch UAV Networks: Survey and Challenges

McManus, Cui, Josh et al. 2023

Preprint

In existing wireless networks, the control programs have been designed manually and for certain predefined scenarios. This process is complicated and error-prone, and the resulting control programs are not resilient to disruptive changes. Data-driven control based on Artificial Intelligence and Machine Learning (AI/ML) has been envisioned as a key technique to automate the modeling, optimization and control of complex wireless systems. However, existing AI/ML techniques rely on sufficient well-labeled data and may suffer from slow convergence and poor generalizability. In this article, focusing on digital twin-assisted wireless unmanned aerial vehicle (UAV) systems, we provide a survey of emerging techniques that can enable fast-converging data-driven control of wireless systems with enhanced generalization capability to new environments. These include SLAM-based sensing and network softwarization for digital twin construction, robust reinforcement learning and system identification for domain adaptation, and testing facility sharing and federation. The corresponding research opportunities are also discussed.

Digital twin-enabled domain adaptation for zero-touch UAV networks: Survey and challenges

McManus, Cui, Zhang et al. 2023

Computer Networks
