Vikas Dhiman scite author profile

The navigation problem is classically approached in two steps: an exploration step, where map-information about the environment is gathered; and an exploitation step, where this information is used to navigate efficiently. Deep reinforcement learning (DRL) algorithms, by contrast, approach the problem of navigation in an end-to-end fashion. Inspired by the classical approach, we ask whether DRL algorithms are able to inherently explore, gather and exploit map-information over the course of navigation. We build upon the work of Mirowski et al. [2017] and introduce a systematic suite of experiments that vary three parameters: the agent's starting location, the agent's target location, and the maze structure. We choose evaluation metrics that explicitly measure the algorithm's ability to gather and exploit map-information. Our experiments show that when trained and tested on the same maps, the algorithm successfully gathers and exploits map-information. However, when trained and tested on different sets of maps, the algorithm fails to transfer the ability to gather and exploit map-information to unseen maps. Furthermore, we find that when the goal location is randomized and the map is kept static, the algorithm is able to gather and exploit map-information, but the exploitation is far from optimal. We open-source our experimental suite in the hope that it serves as a framework for the comparison of future algorithms and leads to the discovery of robust alternatives to classical navigation methods.


A Continuous Occlusion Model for Road Scene Understanding

Dhiman

Tran

Corso

et al. 2016


Voxel planes: Rapid visualization and meshification of point cloud ensembles

Ryde

Dhiman

Platt

2013


Conversion of unorganized point clouds to surface reconstructions is increasingly required in the mobile robotics perception processing pipeline, particularly with the rapid adoption of RGB-D (color and depth) image sensors. Many contemporary methods stem from work in the computer graphics community and handle the point clouds generated by tabletop scanners in a batch-like manner. The requirements for mobile robotics are different and include support for real-time processing, incremental update, localization, mapping, path planning, obstacle avoidance, ray-tracing, terrain traversability assessment, grasping/manipulation and visualization for effective human-robot interaction. We carry out a quantitative comparison of Greedy Projection and Marching Cubes along with our voxel planes method. The execution speed, error, compression and visualization appearance of these are assessed. Our voxel planes approach first computes the PCA over the points inside a voxel, combining these PCA results across 2x2x2 voxel neighborhoods in a sliding window. Second, the eigenvector with the smallest eigenvalue and the voxel centroid define a plane, which is intersected with the voxel to reconstruct the surface patch (a 3-6 sided convex polygon) within that voxel. By nature of their construction, these surface patches tessellate to produce a surface representation of the underlying points. In experiments on public datasets, the voxel planes method is 3 times faster than Marching Cubes, offers 300 times better compression than Greedy Projection, and has 10-fold lower error than Marching Cubes, whilst allowing incremental map updates.
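The plane-fitting step described in the abstract above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `fit_voxel_plane` and `group_by_voxel` and the `voxel_size` parameter are hypothetical, and the sketch omits both the 2x2x2 sliding-window combination of PCA results and the intersection of the plane with the voxel that yields the polygon. It only shows how the voxel centroid and the eigenvector with the smallest eigenvalue define a plane.

```python
import numpy as np


def fit_voxel_plane(points):
    """Fit a plane to the points of one voxel via PCA.

    Returns (centroid, normal). The normal is the unit eigenvector of the
    point covariance matrix that has the smallest eigenvalue.
    """
    points = np.asarray(points, dtype=float)
    if points.ndim != 2 or points.shape[1] != 3 or points.shape[0] < 3:
        raise ValueError("need at least three 3-D points")
    centroid = points.mean(axis=0)
    centered = points - centroid
    cov = centered.T @ centered / points.shape[0]
    # eigh returns eigenvalues of a symmetric matrix in ascending order,
    # so column 0 is the direction of least variance (the plane normal).
    _, eigenvectors = np.linalg.eigh(cov)
    return centroid, eigenvectors[:, 0]


def group_by_voxel(points, voxel_size):
    """Bucket points by integer voxel index (hypothetical helper)."""
    points = np.asarray(points, dtype=float)
    indices = np.floor(points / voxel_size).astype(int)
    voxels = {}
    for idx, point in zip(indices, points):
        key = tuple(int(i) for i in idx)
        voxels.setdefault(key, []).append(point)
    return {key: np.array(pts) for key, pts in voxels.items()}


# Usage: four coplanar points (z = 0.5) inside the unit voxel (0, 0, 0).
cloud = [[0.1, 0.1, 0.5], [0.9, 0.1, 0.5], [0.1, 0.9, 0.5], [0.9, 0.9, 0.5]]
voxels = group_by_voxel(cloud, voxel_size=1.0)
centroid, normal = fit_voxel_plane(voxels[(0, 0, 0)])
```

The sign of the returned normal is arbitrary, so any downstream use would need to orient it consistently, for example toward the sensor.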


Mutual localization: Two camera relative 6-DOF pose estimation from reciprocal fiducial observation

Dhiman

Ryde

Corso

2013


Concurrently estimating the 6-DOF pose of multiple cameras or robots, known as cooperative localization, is a core problem in contemporary robotics. Current works focus on a set of mutually observable world landmarks and often require inbuilt egomotion estimates; situations in which both assumptions are violated often arise, for example, robots with erroneous, low-quality odometry and IMU exploring an unknown environment. In contrast to these existing works in cooperative localization, we propose a cooperative localization method, which we call mutual localization, that uses reciprocal observations of camera-fiducials to obviate the need for egomotion estimates and mutually observable world landmarks. We derive and solve an algebraic formulation for the pose of the two-camera mutual localization setup under these assumptions. Our experiments demonstrate the capabilities of our proposed egomotion-free cooperative localization method: for example, the method achieves 2 cm range and 0.7 degree accuracy at 2 m sensing for 6-DOF pose. To demonstrate the applicability of the proposed work, we deploy our method on Turtlebots and compare our results with ARToolKit [1] and Bundler [2], over which our method achieves a tenfold improvement in translation estimation accuracy.


Heterogeneous Multi-robot Adversarial Patrolling Using Polymatrix Games

Langley

Dhiman

Christensen

2022


Control Barriers in Bayesian Learning of System Dynamics

Dhiman

Khojasteh

Franceschetti

et al. 2023

IEEE Trans. Automat. Contr.


Floyd-Warshall Reinforcement Learning: Learning from Past Experiences to Reach New Goals

Dhiman

Banerjee

Siskind

et al. 2018

Preprint


Consider multi-goal tasks that involve static environments and dynamic goals. Examples of such tasks, such as goal-directed navigation and pick-and-place in robotics, abound. Two types of Reinforcement Learning (RL) algorithms are used for such tasks: model-free and model-based. Each of these approaches has limitations. Model-free RL struggles to transfer learned information when the goal location changes, but achieves high asymptotic accuracy in single-goal tasks. Model-based RL can transfer learned information to new goal locations by retaining the explicitly learned state-dynamics, but is limited by the fact that small errors in modelling these dynamics accumulate over long-term planning. In this work, we address the limitations of model-free RL in multi-goal domains. We do this by adapting the Floyd-Warshall algorithm for RL and call the adaptation Floyd-Warshall RL (FWRL). The proposed algorithm learns a goal-conditioned action-value function by constraining the value of the optimal path between any two states to be greater than or equal to the value of paths via intermediary states. Experimentally, we show that FWRL is more sample-efficient and learns higher-reward strategies in multi-goal tasks as compared to Q-learning, model-based RL and other relevant baselines in a tabular domain. * Several weeks after submitting this work, we found a work by Kaelbling (1993) whose main contribution is similar to ours. We have highlighted the minor differences in the related work section.
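The path constraint stated in the abstract above can be illustrated with a small tabular sketch. This is a simplified illustration under stated assumptions, not the paper's FWRL algorithm: it assumes deterministic transitions, an undiscounted sum of rewards along a path, and a hypothetical table `Q[s, a, g]` holding the best known return for taking action `a` in state `s` and eventually reaching state `g`. The names `record_transition` and `fw_relax` are invented for this example.

```python
import numpy as np


def record_transition(Q, s, a, r, s_next):
    """Store an observed one-step transition s --a--> s_next with reward r."""
    Q[s, a, s_next] = max(Q[s, a, s_next], r)


def fw_relax(Q):
    """One Floyd-Warshall sweep over the goal-conditioned table Q[s, a, g].

    Enforces Q[s, a, g] >= Q[s, a, k] + max_a' Q[k, a', g] for every
    intermediate state k, i.e. the direct value is never worse than the
    value of a path routed through k.
    """
    n_states = Q.shape[0]
    for k in range(n_states):
        best_from_k = Q[k].max(axis=0)           # best value from k to each goal
        via_k = Q[:, :, k, None] + best_from_k   # value of going through k
        np.maximum(Q, via_k, out=Q)
    return Q


# Usage: a 4-state chain 0 -> 1 -> 2 -> 3, action 0 moves right, reward -1.
n_states, n_actions = 4, 2
Q = np.full((n_states, n_actions, n_states), -np.inf)  # -inf marks "unknown"
for s in range(n_states - 1):
    record_transition(Q, s, 0, -1.0, s + 1)
fw_relax(Q)
print(Q[0, 0, 3])  # -3.0: three steps from state 0 to state 3
```

Because the table is indexed by goal, the value of reaching state 3 from state 0 is obtained from one-step experience alone, without any episode in which state 3 was the rewarded goal. That is the kind of transfer across goals the abstract describes.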



Vikas Dhiman

Learning Navigation Costs from Demonstration in Partially Observable Environments

A Critical Investigation of Deep Reinforcement Learning for Navigation
