We propose a novel approach to addressing two fundamental challenges in Model-based Reinforcement Learning (MBRL): the computational expense of repeatedly finding a good policy in the learned model, and the objective mismatch between model fitting and policy computation. Our "lazy" method leverages a novel unified objective, Performance Difference via Advantage in Model, to capture the performance difference between the learned policy and expert policy under the true dynamics. This objective demonstrates that optimizing the expected policy advantage in the learned model under an exploration distribution is sufficient for policy computation, resulting in a significant boost in computational efficiency compared to traditional planning methods. Additionally, the unified objective uses a value moment matching term for model fitting, which is aligned with the model's usage during policy computation. We present two no-regret algorithms to optimize the proposed objective, and demonstrate their statistical and computational gains compared to existing MBRL methods through simulated benchmarks.
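For context, the name of the objective appears to echo the classical performance difference lemma, which expresses the gap between two policies as an expected advantage. The form below is the standard textbook statement, not the unified objective proposed in the abstract. All symbols are assumptions introduced here for illustration: $\gamma$ is the discount factor, $\mu$ the start-state distribution, $d^{\pi}_{\mu}$ the normalized discounted state distribution of $\pi$, and $A^{\pi'}$ the advantage function of $\pi'$.

```latex
% Classical performance difference lemma (background only, not the paper's objective).
% V^{\pi}(\mu) is the expected discounted return of policy \pi from start distribution \mu.
\[
  V^{\pi}(\mu) - V^{\pi'}(\mu)
  = \frac{1}{1-\gamma}\,
    \mathbb{E}_{s \sim d^{\pi}_{\mu}}\,
    \mathbb{E}_{a \sim \pi(\cdot \mid s)}
    \bigl[ A^{\pi'}(s, a) \bigr],
  \qquad
  A^{\pi'}(s, a) = Q^{\pi'}(s, a) - V^{\pi'}(s).
\]
```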
In this paper, we present a Model Predictive Control (MPC) framework for autonomous driving based on the path velocity decomposition paradigm. The optimization underlying the MPC has a two-layer structure: first, an appropriate path is computed for the vehicle, and then the optimal forward velocity along that path is computed. The very nature of the proposed path velocity decomposition allows for seamless compatibility between the two layers of the optimization. A key feature of the proposed work is that it offloads most of the responsibility for collision avoidance to the velocity optimization layer, for which computationally efficient formulations can be derived. In particular, we extend our previously developed concept of time scaled collision cone (TSCC) constraints and formulate the forward velocity optimization layer as a convex quadratic programming problem. We validate the approach on autonomous driving scenarios in which the proposed MPC repeatedly solves both optimization layers in a receding horizon manner to compute lane change, overtaking, and merging maneuvers among multiple dynamic obstacles.
Objectives: The impending and increasing prevalence of diabetic retinopathy (DR) in India has created a need for an affordable and valid community outreach screening programme for DR, especially in rural and hard-to-reach indigenous communities. The present pilot study aimed to compare non-mydriatic fundus photography with indirect ophthalmoscopy as a feasible and logistically convenient screening modality for DR in an older, rural, tribal population in Western India.
Design and setting: This community-based, cross-sectional, prospective population study was part of a module using Rapid Assessment of Avoidable Blindness and DR methodology in 8340 sampled participants aged ≥50 years. The participants identified as diabetic were screened for DR using two methods: non-mydriatic fundus photography performed in the field by trained professionals, with the images then graded by a retina specialist at the base hospital, and indirect ophthalmoscopy performed in the field by expert ophthalmologists. Each assessor was masked to the other's findings so that the two methods could be compared.
Results: The prevalence of DR, sight-threatening DR and maculopathy using indirect ophthalmoscopy was found to be 12.1%, 2.1% and 6.6%, respectively. Fair agreement (κ=0.48 for DR and 0.59 for maculopathy) was observed between the two detection methods. The sensitivity and specificity of fundus photographic evaluation compared with indirect ophthalmoscopy were found to be 54.8% and 92.1% (for DR), 60.7% and 90.8% (for any DR) and 84.2% and 94.8% (for only maculopathy), respectively.
Conclusion: Non-mydriatic fundus photography has the potential to identify DR (any retinopathy or maculopathy) in community settings in the Indian population. Its utility as an affordable, logistically convenient and practical modality is demonstrable. The sensitivity of this screening modality can be further increased by investing in higher-resolution cameras, capturing good-quality images, and training and validating imagers.
Trial registration number: CTRI/2020/01/023025; Clinical Trial Registry, India (CTRI).
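The agreement and accuracy figures reported above (sensitivity, specificity, Cohen's κ) all derive from a 2×2 table comparing the index test with the reference test. The following is a minimal sketch of those standard formulas; the class name, function names and counts are hypothetical and are not taken from the study, whose underlying table is not given in the abstract.

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    """Counts from comparing an index test against a reference test."""
    tp: int  # index positive, reference positive
    fp: int  # index positive, reference negative
    fn: int  # index negative, reference positive
    tn: int  # index negative, reference negative


def sensitivity(t: TwoByTwo) -> float:
    # Fraction of reference-positive cases the index test detects.
    return t.tp / (t.tp + t.fn)


def specificity(t: TwoByTwo) -> float:
    # Fraction of reference-negative cases the index test clears.
    return t.tn / (t.tn + t.fp)


def cohen_kappa(t: TwoByTwo) -> float:
    # Observed agreement corrected for the agreement expected by chance.
    n = t.tp + t.fp + t.fn + t.tn
    observed = (t.tp + t.tn) / n
    p_index_pos = (t.tp + t.fp) / n
    p_ref_pos = (t.tp + t.fn) / n
    expected = p_index_pos * p_ref_pos + (1 - p_index_pos) * (1 - p_ref_pos)
    return (observed - expected) / (1 - expected)


# Hypothetical counts, chosen only to illustrate the arithmetic.
table = TwoByTwo(tp=40, fp=10, fn=20, tn=130)
sens = sensitivity(table)
spec = specificity(table)
kappa = cohen_kappa(table)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} kappa={kappa:.3f}")
```

Note that κ depends on prevalence as well as on raw agreement, which is one reason it is reported alongside sensitivity and specificity.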
Models used in modern planning problems to simulate outcomes of real-world action executions are becoming increasingly complex, ranging from simulators that do physics-based reasoning to precomputed analytical motion primitives. However, robots operating in the real world often face situations that these models do not capture before execution. This imperfect modeling can lead to highly suboptimal or even incomplete behavior during execution. In this paper, we propose an approach for interleaving planning and execution that adapts online using real-world execution and accounts for any discrepancies in dynamics during planning, without requiring updates to the dynamics of the model. This is achieved by biasing the planner away from transitions whose dynamics are discovered to be inaccurately modeled, thereby leading to robot behavior that tries to complete the task despite having an inaccurate model. We provide provable guarantees on the completeness and efficiency of the proposed planning and execution framework under specific assumptions on the model, for both small and large state spaces. Our approach, CMAX, is shown empirically to be efficient in simulated robotic tasks, including 4D planar pushing, and in real robotic experiments using a PR2: a 3D pick-and-place task where the mass of the object is incorrectly modeled, and a 7D arm planning task where one of the joints is not operational, leading to a discrepancy in dynamics.
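The idea of biasing a planner away from transitions found to be inaccurately modeled, while leaving the model's dynamics unchanged, can be sketched on a toy graph. This is an illustration only and not the authors' CMAX implementation: the fixed additive `penalty`, the Dijkstra search, and the names `plan`, `execute`, `model` and `true_step` are all assumptions introduced here.

```python
import heapq


def plan(graph, start, goal, incorrect, penalty=1000.0):
    """Dijkstra over the model graph; flagged (state, action) pairs cost extra.

    graph maps state -> list of (action, next_state, cost).
    Returns a list of (state, action) steps, or None if the goal is unreachable.
    """
    frontier = [(0.0, start)]
    best = {start: 0.0}
    parent = {}
    while frontier:
        cost, s = heapq.heappop(frontier)
        if cost > best.get(s, float("inf")):
            continue  # stale queue entry
        if s == goal:
            steps = []
            while s != start:
                prev, a = parent[s]
                steps.append((prev, a))
                s = prev
            return steps[::-1]
        for a, nxt, c in graph.get(s, []):
            # The model's dynamics are untouched; only the cost is inflated.
            new_cost = cost + c + (penalty if (s, a) in incorrect else 0.0)
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                parent[nxt] = (s, a)
                heapq.heappush(frontier, (new_cost, nxt))
    return None


def execute(model, true_step, start, goal, max_steps=100):
    """Interleave planning and execution, flagging mispredicted transitions."""
    incorrect = set()
    s, trace = start, [start]
    for _ in range(max_steps):
        if s == goal:
            break
        steps = plan(model, s, goal, incorrect)
        if steps is None:
            break
        _, a = steps[0]
        predicted = next(n for act, n, _ in model[s] if act == a)
        actual = true_step(s, a)
        if actual != predicted:
            incorrect.add((s, a))  # discovered discrepancy
        s = actual
        trace.append(s)
    return trace, incorrect


# Hypothetical model: the short route A -> B -> G looks cheapest.
model = {
    "A": [("short", "B", 1.0), ("long", "C", 2.0)],
    "B": [("go", "G", 1.0)],
    "C": [("go", "G", 2.0)],
}


def true_step(s, a):
    # In the "real world" the short action from A fails and the robot stays put.
    if (s, a) == ("A", "short"):
        return "A"
    return next(n for act, n, _ in model[s] if act == a)


trace, incorrect = execute(model, true_step, "A", "G")
print(trace, incorrect)
```

In this toy run the executor tries the short route once, observes the mismatch, and then replans through the longer route without ever editing `model`.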