We study the problem of inverse reinforcement learning (IRL) with the added twist that the learner is assisted by a helpful teacher. More formally, we tackle the following algorithmic question: How could a teacher provide an informative sequence of demonstrations to an IRL learner to speed up the learning process? We present an interactive teaching framework where a teacher adaptively chooses the next demonstration based on the learner's current policy. In particular, we design teaching algorithms for two concrete settings: an omniscient setting where the teacher has full knowledge about the learner's dynamics, and a blackbox setting where the teacher has minimal knowledge. Then, we study a sequential variant of the popular MCE-IRL learner and prove convergence guarantees of our teaching algorithm in the omniscient setting. Extensive experiments with a car driving simulator environment show that learning can be sped up drastically compared to an uninformative teacher.
We consider the machine teaching problem in a classroom-like setting wherein the teacher has to deliver the same examples to a diverse group of students. Their diversity stems from differences in their initial internal states as well as their learning rates. We prove that a teacher with full knowledge about the learning dynamics of the students can teach a target concept to the entire classroom using $O\left(\min\{d, N\} \log \frac{1}{\epsilon}\right)$ examples, where $d$ is the ambient dimension of the problem, $N$ is the number of learners, and $\epsilon$ is the accuracy parameter. We show the robustness of our teaching strategy when the teacher has limited knowledge of the learners' internal dynamics as provided by a noisy oracle. Further, we study the trade-off between the learners' workload and the teacher's cost in teaching the target concept. Our experiments validate our theoretical results and suggest that appropriately partitioning the classroom into homogeneous groups provides a balance between these two objectives.
Differential privacy provides strong privacy guarantees while simultaneously enabling useful insights from sensitive datasets. However, it provides the same level of protection for all elements (individuals and attributes) in the data. There are practical scenarios where some data attributes need more or less protection than others. In this paper, we consider $d_{\mathcal{X}}$-privacy, an instantiation of the privacy notion introduced in [6], which allows this flexibility by specifying a separate privacy budget for each pair of elements in the data domain. We describe a systematic procedure to tailor any existing differentially private mechanism that takes a query set and a sensitivity vector as input into its $d_{\mathcal{X}}$-private variant, specifically focusing on linear queries. Our proposed meta procedure has broad applications, as linear queries form the basis of a range of data analysis and machine learning algorithms, and the ability to define a more flexible privacy budget across the data domain results in an improved privacy/utility trade-off in these applications. We propose several $d_{\mathcal{X}}$-private mechanisms and provide theoretical guarantees on the trade-off between utility and privacy. We also experimentally demonstrate the effectiveness of our procedure by evaluating our proposed $d_{\mathcal{X}}$-private Laplace mechanism on both synthetic and real datasets using a set of randomly generated linear queries.
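For orientation, the baseline that such a procedure would start from can be sketched as the standard Laplace mechanism for linear queries over a histogram. This is a minimal sketch of ordinary ε-differential privacy only; it is not the $d_{\mathcal{X}}$-private variant proposed in the paper, whose details the abstract does not give. The function name `laplace_mechanism`, the histogram representation of the dataset, and the add/remove-one-record neighboring relation are all assumptions made for illustration.

```python
import numpy as np


def laplace_mechanism(queries, histogram, epsilon, rng=None):
    """Answer linear queries on a histogram with standard epsilon-DP Laplace noise.

    Hypothetical helper for illustration; not the paper's d_X-private mechanism.
    queries:   (m, n) matrix, one linear query per row.
    histogram: length-n vector of counts over the data domain.
    Returns the noisy answers and the Laplace scale that was used.
    """
    rng = np.random.default_rng() if rng is None else rng
    queries = np.asarray(queries, dtype=float)
    histogram = np.asarray(histogram, dtype=float)

    # Assumed neighboring relation: adding or removing one record changes a
    # single histogram cell by 1, so the answer vector moves by one column of
    # the query matrix. The L1 sensitivity is the largest column L1 norm.
    sensitivity = np.abs(queries).sum(axis=0).max()
    scale = sensitivity / epsilon

    true_answers = queries @ histogram
    noise = rng.laplace(loc=0.0, scale=scale, size=true_answers.shape)
    return true_answers + noise, scale


if __name__ == "__main__":
    example_queries = [[1, 1, 0], [0, 1, 1]]  # two range-style counting queries
    example_histogram = [3, 5, 2]             # made-up counts for three domain elements
    noisy, used_scale = laplace_mechanism(example_queries, example_histogram, epsilon=1.0)
    print("noisy answers:", noisy, "Laplace scale:", used_scale)
```

Here a single global sensitivity and a single ε calibrate the noise for every element of the domain, which is exactly the uniform protection the abstract describes as a limitation of standard differential privacy.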
I would like to express my gratitude to all the people whose help, advice, and support made significant contributions to this thesis. First, I would like to warmly acknowledge the valuable guidance and the continuous support of my primary supervisor, Bob Williamson. His wide perspective and insight were instrumental in steering my research over the course of my studies. I have learned so much from him and have thoroughly enjoyed our interactions. Very special thanks go to Xinhua Zhang, who is my co-supervisor. I am very grateful for his constant support, availability, and patience, and for his thoughtful comments, sharp insights, constructive criticism, and discussions. I would like to thank the Australian Government for its great support for research. I was very kindly supported by both the Australian National University and Data 61 (then NICTA). I thank both for creating a fantastic environment for research. I was lucky to have helpful colleagues for technical and general discussions, among them Aditya Krishna Menon, Richard Nock, and Brendan van Rooyen. My special thanks to Tim van Erven and Prateek Jain for hosting me kindheartedly while I was visiting their research groups. Thank you to both of them for stimulating technical discussions. My very heartfelt thanks go to my friends in Canberra, who shared a lot of laughter, debates, and ideas. Special thanks to my long-time friend and house-mate Ajanthan; I treasure our many friendly conversations on various topics. I would like to express my deep gratitude to my siblings (Nathan and Manju) and my long-time friends in Sri Lanka (Aravinthan, Pathmayogan, Prakash, and Manorathan). I still feel touched by the great trust and affection they showed me. And finally, my deepest thanks to my parents for their unconditional love; without their input, I would not be the person I am today!