In this paper, we consider the important problem of safe exploration in reinforcement learning. While reinforcement learning is well-suited to domains with complex transition dynamics and high-dimensional state-action spaces, an additional challenge is posed by the need for safe and efficient exploration. Traditional exploration techniques are not particularly useful for solving dangerous tasks, where the trial-and-error process may select actions whose execution in some states damages the learning system (or any other system). Consequently, when an agent begins interacting with a dangerous, high-dimensional state-action space, an important question arises: how to avoid (or at least minimize) the damage caused by exploring that space. We introduce the PI-SRL algorithm, which safely improves suboptimal albeit robust behaviors for continuous state and action control tasks and which efficiently learns from the experience gained from the environment. We evaluate the proposed method in four complex tasks: automatic car parking, pole-balancing, helicopter hovering, and business management.
Socially assistive robots are emerging as a powerful tool for the upcoming silver society. They are among the technologies for Assisted Living, offering a natural interface with smart environments while helping people through social interaction. The CLARC project aims to develop a socially assistive robot to help clinicians perform Comprehensive Geriatric Assessment (CGA) procedures. This robot autonomously conducts some tests and processes, freeing the clinician's time for higher-value activities such as designing care plans. The project has recently finished its first two phases and is now entering its final one. This paper details the current prototype of the CLARC system and the main results collected so far during its evaluation. It then describes the updates and modifications planned for the next year, in which extensive long-term evaluations will be conducted to validate its acceptability and utility.
One of the aims of cognitive robotics is to endow robots with the ability to plan solutions for complex goals and then to enact those plans. Additionally, robots should react properly upon encountering unexpected changes in their environment that are not part of their planned course of actions. This requires a close coupling between deliberative and reactive control flows. From the perspective of robotics, this coupling generally entails a tightly integrated perceptuomotor system, which is then loosely connected to some specific form of deliberative system such as a planner. From the high-level perspective of automated planning, the emphasis is on a highly functional system that, taken to its extreme, calls perceptual and motor modules as services when required. This paper proposes to join the perceptual and acting perspectives via a unique representation where the responses of all software modules in the architecture are generalized using the same set of tokens. The proposed representation integrates symbolic and metric information. The proposed approach has been successfully tested in CLARC, a robot that performs Comprehensive Geriatric Assessments of elderly patients. The robot was favourably appraised in a survey conducted to assess its behaviour. For instance,
Reinforcement Learning (RL) methods are widely used for dynamic control tasks. In many cases, these are high-risk tasks in which the trial-and-error process may select actions whose execution from unsafe states can be catastrophic. In addition, many of these tasks have continuous state and action spaces, which makes the learning problem harder and intractable for conventional RL algorithms. So, when the agent begins to interact with a risky environment with a large state-action space, an important question arises: how can we prevent the exploration of the state-action space from damaging the learning system (or other systems)? In this paper, we define the concept of risk and address the problem of safe exploration in the context of RL. Our notion of safety is concerned with states that can lead to damage. Moreover, we introduce an algorithm that safely improves suboptimal but robust behaviors for continuous state and action control tasks, and that learns efficiently from the experience gathered from the environment. We report experimental results using the helicopter hovering task from the RL Competition.
Although the notion of task similarity is potentially interesting in a wide range of areas such as curriculum learning or automated planning, it has mostly been tied to transfer learning. Transfer is based on the idea of reusing the knowledge acquired in learning a set of source tasks in a new learning process for a target task, assuming that the target and source tasks are close enough. In recent years, transfer learning has succeeded in making reinforcement learning (RL) algorithms more efficient (e.g., by reducing the number of samples needed to achieve (near-)optimal performance). Transfer in RL is based on the core concept of similarity: whenever the tasks are similar, the transferred knowledge can be reused to solve the target task and significantly improve the learning performance. Therefore, the selection of good metrics to measure these similarities is a critical aspect when building transfer RL algorithms, especially when this knowledge is transferred from simulation to the real world. The literature contains many metrics for measuring the similarity between MDPs, and hence many definitions of similarity, or of its complement, distance, have been considered. In this paper, we propose a categorization of these metrics and analyze the definitions of similarity proposed so far in light of that categorization. We also follow this taxonomy to survey the existing literature and to suggest future directions for the construction of new metrics.
Endoscopic ultrasonography (EUS) is considered one of the most accurate methods for the diagnosis and staging of pancreatic tumors. EUS-guided fine-needle aspiration (FNA) increases the diagnostic accuracy of EUS in this setting; however, it is technically demanding (a pathologist is also essential) and is furthermore associated with a small, but not insignificant, morbidity. EUS pancreatic elastography, which analyzes tissue stiffness, has emerged as a new and very useful tool for the differential diagnosis of solid pancreatic masses. Elastography provides specific patterns supporting the benign or malignant nature of the disease. However, it is limited by the subjective interpretation of images. Second-generation elastography has recently been developed, and allows a quantitative analysis of tissue stiffness. It is based on the determination of a strain ratio (obtained by comparing the strain value of the mass to a strain value from a control area in the region under study). We present two cases reflecting the usefulness of second-generation elastography in the differential diagnosis between pancreatic adenocarcinoma and an inflammatory mass in the context of chronic pancreatitis. We found significant differences between the two masses in the strain ratio values (25.46% in the pancreatic adenocarcinoma vs. 2.35% in the inflammatory mass). Second-generation elastography is a very useful tool for the differential diagnosis of solid pancreatic masses.
Key words: Endoscopic ultrasound. Elastography. Second-generation Elastography. Pancreatic tumors.
INTRODUCTION
Endoscopic ultrasonography (EUS) has become a basic tool for the study of pancreatic diseases, and is considered one of the most accurate methods for the diagnosis and staging of both chronic inflammatory and neoplastic pancreatic diseases (1,2). However, differentiation between pancreatic cancer and focal pancreatitis remains a challenge.
EUS can guide fine-needle aspiration (EUS-FNA) for the collection of cytological samples from pancreatic lesions with a very high overall diagnostic accuracy (3-7). EUS-FNA may be, however, technically demanding, and multiple punctures of pancreatic lesions may be needed to obtain adequate material for cytological or microhistological evaluation. EUS-FNA of the pancreas, despite being considered very safe, is furthermore associated with a small, but not insignificant, morbidity (8,9).
Elastography is a method for the real-time evaluation of tissue stiffness, which has been used for the analysis of superficial organ lesions such as those of the breast (10-13). Images obtained by elastography represent tissue elasticity, which may reflect histopathological differences (14). The association of this technology with EUS has brought a significant advance in the management of pancreatic diseases, mainly in the differential diagnosis of pancreatic tumors (15). A new advance has recently been developed: second-generation elastography. This technique allows not only a qualitative elastographic analysis, but also a...