Ian Char scite author profile

Biometric authentication relies on an individual's inner characteristics and traits. We propose an active authentication system for mobile devices that relies on two biometric modalities: 3D gestures and face recognition. The novelty of our approach is that it combines 3D gesture and face recognition in a non-intrusive, unconstrained environment: the active authentication system runs in the background while the user performs their main task.


BATS: Best Action Trajectory Stitching

Char¹, Mehta², Villaflor³ et al. 2022

Preprint


The problem of offline reinforcement learning focuses on learning a good policy from a log of environment interactions. Past efforts to develop algorithms in this area have revolved around introducing constraints to online reinforcement learning algorithms to ensure that the actions of the learned policy are constrained to the logged data. In this work, we explore an alternative approach by planning on the fixed dataset directly. Specifically, we introduce an algorithm that forms a tabular Markov Decision Process (MDP) over the logged data by adding new transitions to the dataset. We do this by using learned dynamics models to plan short trajectories between states. Since exact value iteration can be performed on this constructed MDP, it becomes easy to identify which trajectories are advantageous to add to the MDP. Crucially, since most transitions in this MDP come from the logged data, trajectories from the MDP can be rolled out for long periods with confidence. We prove that this property allows one to derive upper and lower bounds on the value function up to appropriate distance metrics. Finally, we demonstrate empirically how algorithms that uniformly constrain the learned policy to the entire dataset can result in unwanted behavior, and we show an example in which simply behavior cloning the optimal policy of the MDP created by our algorithm avoids this problem.
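The core idea in the abstract, exact value iteration on a tabular MDP whose edges are logged transitions plus a few added ones, can be illustrated with a small sketch. This is not the BATS implementation: the state names, the `stitched` edge (standing in for a short trajectory proposed by a learned dynamics model), the deterministic transitions, and the `value_iteration` helper are all assumptions made for illustration.

```python
from collections import defaultdict


def value_iteration(transitions, gamma=0.9, tol=1e-10, max_iters=10_000):
    """Exact value iteration on a deterministic tabular MDP.

    transitions: iterable of (state, action, reward, next_state) tuples.
    States with no outgoing edge are treated as terminal (value 0).
    """
    edges = defaultdict(list)
    states = set()
    for s, a, r, s_next in transitions:
        edges[s].append((a, r, s_next))
        states.update((s, s_next))

    values = {s: 0.0 for s in states}
    for _ in range(max_iters):
        delta = 0.0
        for s in states:
            if not edges[s]:
                continue  # terminal state keeps value 0
            best = max(r + gamma * values[s2] for _, r, s2 in edges[s])
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            break

    # Greedy policy with respect to the converged values.
    policy = {
        s: max(edges[s], key=lambda e: e[1] + gamma * values[e[2]])[0]
        for s in states
        if edges[s]
    }
    return values, policy


# Two logged trajectories (hypothetical data): only the second reaches reward.
logged = [
    ("s0", "a", 0.0, "s1"),
    ("s1", "a", 0.0, "s2"),
    ("s3", "b", 0.0, "s4"),
    ("s4", "b", 1.0, "s5"),
]
# Hypothetical added edge connecting the first trajectory to the second.
stitched = [("s1", "stitch", 0.0, "s3")]

values_before, _ = value_iteration(logged)
values_after, policy_after = value_iteration(logged + stitched)
```

Comparing `values_before["s0"]` with `values_after["s0"]` shows whether the added edge is worth keeping: in this toy example the value of `s0` rises from 0 to about 0.729, and the greedy policy at `s1` switches to the stitched action.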


Near-optimal Policy Identification in Active Reinforcement Learning

Li¹, Mehta², Kirschner³ et al. 2022

Preprint



Beyond Pinball Loss: Quantile Methods for Calibrated Uncertainty Quantification

Neural Dynamical Systems: Balancing Structure and Flexibility in Physical Prediction

Toward a non-intrusive, physio-behavioral biometric for smartphones
