Lucie Daubigney scite author profile

Lucie Daubigney

2 Publications

50 Citation Statements Received

55 Citation Statements Given

Affiliations

Lorraine Research Laboratory in Computer Science and its Applications, Supélec, Institut Polytechnique de Bordeaux

Publications

A Comprehensive Reinforcement Learning Framework for Dialogue Management Optimization

Daubigney, Geist, Chandramohan, et al., 2012

IEEE J. Sel. Top. Signal Process.

Reinforcement learning is now an acknowledged approach for optimizing the interaction strategy of spoken dialogue systems. While the first algorithms considered were quite basic (like SARSA), recent work has concentrated on more sophisticated methods. More attention has been paid to off-policy learning, dealing with the exploration-exploitation dilemma, sample efficiency, and handling non-stationarity. New algorithms have been proposed to address these issues and have been applied to dialogue management. However, each algorithm often solves a single issue at a time, while dialogue systems exhibit all the problems at once. In this paper, we propose to apply the Kalman Temporal Differences (KTD) framework to the problem of dialogue strategy optimization so as to address all these issues in a comprehensive manner with a single framework. Our claims are illustrated by experiments conducted on two real-world goal-oriented dialogue management frameworks, DIPPER and HIS.

Index Terms: Dialogue management, reinforcement learning, spoken dialogue system.

Off-policy learning in large-scale POMDP-based dialogue systems

Daubigney, Geist, Pietquin, 2012

Reinforcement learning (RL) is now part of the state of the art in the domain of spoken dialogue systems (SDS) optimisation. The best-performing RL methods, such as those based on Gaussian Processes, require testing small changes in the policy to assess whether they are improvements or degradations. This process is called on-policy learning. However, it can result in system behaviours that are unacceptable to users. Learning algorithms should ideally infer an optimal strategy by observing interactions generated by a non-optimal but acceptable strategy, that is, by learning off-policy. Such methods usually fail to scale up and are thus not suited to real-world systems. In this contribution, a sample-efficient, online and off-policy RL algorithm is proposed to learn an optimal policy. This algorithm is combined with a compact non-linear value function representation (namely a multilayer perceptron), making it possible to handle large-scale systems.
