Findings of the Association for Computational Linguistics: EMNLP 2020
DOI: 10.18653/v1/2020.findings-emnlp.75
Actor-Double-Critic: Incorporating Model-Based Critic for Task-Oriented Dialogue Systems

Abstract: In order to improve the sample-efficiency of deep reinforcement learning (DRL), we implemented imagination augmented agent (I2A) in spoken dialogue systems (SDS). Although I2A achieves a higher success rate than baselines by augmenting predicted future into a policy network, its complicated architecture introduces unwanted instability. In this work, we propose actor-double-critic (ADC) to improve the stability and overall performance of I2A. ADC simplifies the architecture of I2A to reduce excessive parameters…
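The abstract describes ADC as pairing an actor with two critics, one model-free and one model-based. As an illustrative sketch only (the toy MDP, the critic-blending scheme, and all hyperparameters below are assumptions of this example, not the paper's architecture), a tabular actor whose advantage signal averages a model-free TD critic with a critic that plans through a learned environment model might look like:

```python
import math
import random

random.seed(0)

# Toy 2-state, 2-action MDP: action 1 in state 0 reaches the goal (state 1, reward 1).
N_STATES, N_ACTIONS, GAMMA = 2, 2, 0.9

def step(s, a):
    if s == 0 and a == 1:
        return 1, 1.0, True   # reach the goal
    return 0, 0.0, False      # stay in the start state

theta = [[0.0] * N_ACTIONS for _ in range(N_STATES)]  # actor (policy logits)
v_free = [0.0] * N_STATES                             # critic 1: model-free value
model = {}                                            # learned model: (s, a) -> (s', r)

def softmax_policy(s):
    exps = [math.exp(x) for x in theta[s]]
    z = sum(exps)
    return [e / z for e in exps]

def v_model(s, depth=3):
    # Critic 2: model-based value via a short lookahead through the learned model.
    if depth == 0:
        return v_free[s]
    best = 0.0
    for a in range(N_ACTIONS):
        if (s, a) in model:
            s2, r = model[(s, a)]
            best = max(best, r + (0.0 if s2 == 1 else GAMMA * v_model(s2, depth - 1)))
    return best

for episode in range(200):
    s, done, t = 0, False, 0
    while not done and t < 10:
        probs = softmax_policy(s)
        a = random.choices(range(N_ACTIONS), probs)[0]
        s2, r, done = step(s, a)
        model[(s, a)] = (s2, r)                       # update the learned model
        target = r + (0.0 if done else GAMMA * v_free[s2])
        baseline = 0.5 * (v_free[s] + v_model(s))     # blend the two critics
        adv = target - baseline
        v_free[s] += 0.1 * (target - v_free[s])       # model-free TD update
        for b in range(N_ACTIONS):                    # policy-gradient step on the actor
            grad = (1.0 if b == a else 0.0) - probs[b]
            theta[s][b] += 0.1 * adv * grad
        s, t = s2, t + 1

print(round(softmax_policy(0)[1], 2))  # probability of the goal-reaching action
```

After training, the actor strongly prefers the goal-reaching action; the blended baseline is one simple way to combine two value estimates, not necessarily the paper's.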

Cited by 2 publications (1 citation statement)
References 29 publications (19 reference statements)
“…Model-free reinforcement learning methods interact directly with pre-built environments or real users to learn dialogue policies [4]. Model-based reinforcement learning is comprised of two simultaneous learning modules: model learning and policy learning [5].…”
Section: Related Work, 2.1 Task-Oriented Dialogue Systems
confidence: 99%
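The cited statement's two simultaneous modules, model learning and policy learning, are the pattern behind Dyna-style methods. A minimal tabular sketch of how the two modules interleave (the toy chain environment, planning budget, and hyperparameters are assumptions of this example):

```python
import random

random.seed(1)

# Toy deterministic chain: states 0..3, goal at 3; action 1 moves right, 0 moves left.
GOAL, GAMMA, ALPHA, EPS = 3, 0.9, 0.5, 0.2

def step(s, a):
    s2 = min(s + 1, GOAL) if a == 1 else max(s - 1, 0)
    return s2, (1.0 if s2 == GOAL else 0.0), s2 == GOAL

Q = {(s, a): 0.0 for s in range(4) for a in range(2)}
model = {}  # module 1, model learning: (s, a) -> (s', r)

def greedy(s):
    return max(range(2), key=lambda a: Q[(s, a)])

for episode in range(50):
    s, done = 0, False
    while not done:
        a = random.randrange(2) if random.random() < EPS else greedy(s)
        s2, r, done = step(s, a)
        model[(s, a)] = (s2, r)  # model learning from the real transition
        # Module 2a, policy learning from the real transition (Q-learning update).
        Q[(s, a)] += ALPHA * (r + GAMMA * max(Q[(s2, b)] for b in range(2)) * (not done) - Q[(s, a)])
        # Module 2b, policy learning from imagined transitions (planning).
        for _ in range(5):
            (ps, pa), (ps2, pr) = random.choice(list(model.items()))
            pdone = ps2 == GOAL
            Q[(ps, pa)] += ALPHA * (pr + GAMMA * max(Q[(ps2, b)] for b in range(2)) * (not pdone) - Q[(ps, pa)])
        s = s2

print([greedy(s) for s in range(3)])  # greedy action in each non-terminal state
```

The planning loop reuses real experience through the learned model, which is the sample-efficiency argument the cited text makes for model-based methods.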