John J. Garcia scite author profile

Safe Exploration of State and Action Spaces in Reinforcement Learning

In this paper, we consider the important problem of safe exploration in reinforcement learning. While reinforcement learning is well-suited to domains with complex transition dynamics and high-dimensional state-action spaces, an additional challenge is posed by the need for safe and efficient exploration. Traditional exploration techniques are not particularly useful for solving dangerous tasks, where the trial and error process may lead to the selection of actions whose execution in some states may result in damage to the learning system (or any other system). Consequently, when an agent begins an interaction with a dangerous and high-dimensional state-action space, an important question arises: how to avoid (or at least minimize) damage caused by the exploration of the state-action space. We introduce the PI-SRL algorithm, which safely improves suboptimal albeit robust behaviors for continuous state and action control tasks and efficiently learns from the experience gained from the environment. We evaluate the proposed method in four complex tasks: automatic car parking, pole-balancing, helicopter hovering, and business management.

Integrating the users in the design of a robot for making Comprehensive Geriatric Assessments (CGA) to elderly people in care centers

Ting

Voilmy

Iglesias

et al. 2017

Towards a robust robotic assistant for Comprehensive Geriatric Assessment procedures: updating the CLARC system

Martínez

Romero-Garcés

Suárez

et al. 2018

Socially assistive robots appear as a powerful tool in the upcoming silver society. They are among the technologies for Assisted Living, offering a natural interface with smart environments, while helping people through social interaction. The CLARC project aims to develop a socially assistive robot to help clinicians perform Comprehensive Geriatric Assessment (CGA) procedures. This robot autonomously drives some tests and processes, saving time for the clinician to perform more added-value activities, like designing care plans. The project has recently finished its first two phases, and now it faces its final one. This paper details the current prototype of the CLARC system and the main results collected so far during its evaluation. Then, it describes the updates and modifications planned for the next year, in which long term extensive evaluations will be conducted to validate its acceptability and utility.

Perceptions or Actions? Grounding How Agents Interact Within a Software Architecture for Cognitive Robotics

et al. 2019

One of the aims of cognitive robotics is to endow robots with the ability to plan solutions for complex goals and then to enact those plans. Additionally, robots should react properly upon encountering unexpected changes in their environment that are not part of their planned course of actions. This requires a close coupling between deliberative and reactive control flows. From the perspective of robotics, this coupling generally entails a tightly integrated perceptuomotor system, which is then loosely connected to some specific form of deliberative system such as a planner. From the high-level perspective of automated planning, the emphasis is on a highly functional system that, taken to its extreme, calls perceptual and motor modules as services when required. This paper proposes to join the perceptual and acting perspectives via a unique representation where the responses of all software modules in the architecture are generalized using the same set of tokens. The proposed representation integrates symbolic and metric information. The proposed approach has been successfully tested in CLARC, a robot that performs Comprehensive Geriatric Assessments of elderly patients. The robot was favourably appraised in a survey conducted to assess its behaviour. For instance,

Distribution of retinoids in different compartments of the posterior segment of the rabbit eye

Lai

Tsin

Lam

et al. 1985

Brain Research Bulletin

Safe reinforcement learning in high-risk tasks through policy improvement

Garcia

Rebollo

2011

Reinforcement Learning (RL) methods are widely used for dynamic control tasks. In many cases, these are high-risk tasks where the trial and error process may select actions whose execution from unsafe states can be catastrophic. In addition, many of these tasks have continuous state and action spaces, making the learning problem harder and unapproachable with conventional RL algorithms. So, when the agent begins to interact with a risky and large state-action space environment, an important question arises: how can we prevent the exploration of the state-action space from damaging the learning (or other) systems? In this paper, we define the concept of risk and address the problem of safe exploration in the context of RL. Our notion of safety is concerned with states that can lead to damage. Moreover, we introduce an algorithm that safely improves suboptimal but robust behaviors for continuous state and action control tasks, and that learns efficiently from the experience gathered from the environment. We report experimental results using the helicopter hovering task from the RL Competition.

A taxonomy for similarity metrics between Markov decision processes

2022

Although the notion of task similarity is potentially interesting in a wide range of areas such as curriculum learning or automated planning, it has mostly been tied to transfer learning. Transfer is based on the idea of reusing the knowledge acquired in the learning of a set of source tasks in a new learning process in a target task, assuming that the target and source tasks are close enough. In recent years, transfer learning has succeeded in making reinforcement learning (RL) algorithms more efficient (e.g., by reducing the number of samples needed to achieve (near-)optimal performance). Transfer in RL is based on the core concept of similarity: whenever the tasks are similar, the transferred knowledge can be reused to solve the target task and significantly improve the learning performance. Therefore, the selection of good metrics to measure these similarities is a critical aspect when building transfer RL algorithms, especially when this knowledge is transferred from simulation to the real world. In the literature, there are many metrics to measure the similarity between MDPs; hence, many definitions of similarity, or of its complement, distance, have been considered. In this paper, we propose a categorization of these metrics and analyze the definitions of similarity proposed so far, taking into account such categorization. We also follow this taxonomy to survey the existing literature and to suggest future directions for the construction of new metrics.

Second-generation endoscopic ultrasound elastography in the differential diagnosis of solid pancreatic masses: Pancreatic cancer vs. inflammatory mass in chronic pancreatitis

Garcia

Noia

Castro

et al. 2009

Rev. esp. enferm. dig.

Endoscopic ultrasonography (EUS) is considered one of the most accurate methods for the diagnosis and staging of pancreatic tumors. EUS-guided fine-needle aspiration (FNA) increases the diagnostic accuracy of EUS in this setting; however, it is technically demanding (a pathologist is also essential) and is furthermore associated with a small but not insignificant morbidity. EUS pancreatic elastography, by analyzing tissue stiffness, arises as a new and very useful tool for the differential diagnosis of solid pancreatic masses. Elastography provides specific patterns supporting the benign or malignant nature of the disease. However, there is a handicap related to the subjective interpretation of images. Second-generation elastography has been recently developed, and allows a quantitative analysis of tissue stiffness. It is based on the determination of a strain ratio (obtained after comparing the strain value of the mass to a strain value from a control area in the region under study). We present two cases reflecting the usefulness of second-generation elastography in the differential diagnosis between pancreatic adenocarcinoma and an inflammatory mass in the context of chronic pancreatitis. We found significant differences between both masses in the strain ratio values (25.46% in the pancreatic adenocarcinoma vs. 2.35% in the inflammatory mass). Second-generation elastography is a very useful tool for the differential diagnosis of solid pancreatic masses.

Key words: Endoscopic ultrasound. Elastography. Second-generation Elastography. Pancreatic tumors.

INTRODUCTION

Endoscopic ultrasonography (EUS) has become a basic tool for the study of pancreatic diseases, and is considered one of the most accurate methods for the diagnosis and staging of both chronic inflammatory and neoplastic pancreatic diseases (1,2). However, differentiation between pancreatic cancer and focal pancreatitis remains a challenge.

EUS can guide fine-needle aspiration (EUS-FNA) for the collection of cytological samples from pancreatic lesions with a very high overall diagnostic accuracy (3-7). EUS-FNA may be, however, technically demanding, and multiple puncturing of pancreatic lesions may be needed to obtain adequate material for cytological or microhistological evaluation. EUS-FNA of the pancreas, despite being considered very safe, is furthermore associated with a small but not insignificant morbidity (8,9).

Elastography is a method for the real-time evaluation of tissue stiffness, which has been used for the analysis of superficial organ lesions such as those of the breast (10-13). Images obtained by elastography represent tissue elasticity, which may reflect histopathological differences (14). The association of this technology with EUS has implied a significant advance in the management of pancreatic diseases, mainly in the differential diagnosis of pancreatic tumors (15). A new advance has recently been developed: second-generation elastography. This technique allows not only a qualitative elastographic analysis, but also a...
