Abstract: We present a policy search method for learning complex feedback control policies that map from high-dimensional sensory inputs to motor torques, for manipulation tasks with discontinuous contact dynamics. We build on a prior technique called guided policy search (GPS), which iteratively optimizes a set of local policies for specific instances of a task, and uses these to train a complex, high-dimensional global policy that generalizes across task instances. We extend GPS in the following ways: (1) we propose the use of a model-free local optimizer based on path integral stochastic optimal control (PI²), which enables us to learn local policies for tasks with highly discontinuous contact dynamics; and (2) we enable GPS to train on a new set of task instances in every iteration by using on-policy sampling: this increases the diversity of the instances that the policy is trained on, and is crucial for achieving good generalization. We show that these contributions enable us to learn deep neural network policies that can directly perform torque control from visual input. We validate the method on a challenging door opening task and a pick-and-place task, and we demonstrate that our approach substantially outperforms the prior LQR-based local policy optimizer on these tasks. Furthermore, we show that on-policy sampling significantly increases the generalization ability of these policies.
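The abstract does not spell out the PI² update, so the following is only a minimal sketch of the probability-weighted averaging commonly associated with path integral methods: each sampled trajectory gets a weight proportional to exp(-cost / eta), and the new mean is the weighted average of the samples. The function names, the temperature `eta`, and the toy data are assumptions made for illustration, not the authors' implementation.

```python
import math

def pi2_weights(costs, eta=1.0):
    """Normalized weights exp(-S_i / eta); lower cost gives higher weight."""
    # Subtracting the minimum cost improves numerical stability and
    # leaves the normalized weights unchanged.
    s_min = min(costs)
    unnorm = [math.exp(-(s - s_min) / eta) for s in costs]
    total = sum(unnorm)
    return [u / total for u in unnorm]

def pi2_update(samples, costs, eta=1.0):
    """Return the probability-weighted mean of the sampled vectors."""
    weights = pi2_weights(costs, eta)
    dim = len(samples[0])
    return [sum(w * x[d] for w, x in zip(weights, samples)) for d in range(dim)]

# Hypothetical data: three sampled action vectors and their trajectory costs.
samples = [[0.0, 1.0], [1.0, 0.0], [2.0, 2.0]]
costs = [1.0, 1.0, 50.0]
new_mean = pi2_update(samples, costs, eta=1.0)
```

Because the update uses only sampled costs and never differentiates a dynamics model, it illustrates why a model-free optimizer of this kind can tolerate discontinuous contact dynamics. In this toy example the high-cost third sample receives a negligible weight, so the new mean lies essentially midway between the first two samples.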
Abstract: In principle, reinforcement learning and policy search methods can enable robots to learn highly complex and general skills that may allow them to function amid the complexity and diversity of the real world. However, training a policy that generalizes well across a wide range of real-world conditions requires far greater quantity and diversity of experience than is practical to collect with a single robot. Fortunately, it is possible for multiple robots to share their experience with one another, and thereby learn a policy collectively. In this work, we explore distributed and asynchronous policy learning as a means to achieve generalization and improved training times on challenging, real-world manipulation tasks. We propose a distributed and asynchronous version of Guided Policy Search and use it to demonstrate collective policy learning on a vision-based door opening task using four robots. We show that it achieves better generalization, utilization, and training times than the single robot alternative.
Introduction: More than 424 million adults have diabetes mellitus (DM). This number is expected to increase to 626 million by 2045. The majority (90-95%) of people with DM have type 2 diabetes (T2DM). The continued prevalence of DM and its associated complications has prompted investigators to find new therapies. Among the most recent additions to the antidiabetic armamentarium are inhibitors of sodium-glucose co-transporters 1 and 2 (SGLT1, SGLT2). Areas covered: The authors review the status of SGLT2 inhibitors for the treatment of T2DM and place an emphasis on those agents in early phase clinical trials. Data and information were retrieved from the American Diabetes Association, Diabetes UK, ClinicalTrials.gov, PubMed, and Scopus websites. The keywords used in the search were T2DM, SGLT1, SGLT2 and clinical trials. Expert opinion: The benefits of SGLT inhibitors include reductions in serum glycated haemoglobin (HbA1c), body weight, blood pressure and cardiovascular and renal events. However, SGLT inhibitors increase the risk of genitourinary tract infections, diabetic ketoacidosis and bone fractures. The development of SGLT inhibitors with fewer side effects, and their use in combination therapies, is the key to maximizing the therapeutic effects of this important class of antidiabetic drugs.
This paper proposes a novel image segmentation model based on the watershed method. To prevent the over-segmentation produced by the traditional watershed, the proposed algorithm has five stages. First, morphological reconstruction is applied to smooth flat areas while preserving the edges of the image. Second, a multiscale morphological gradient is used to avoid the thickening and merging of edges. Third, the top/bottom hat transformation is used for contrast enhancement. Fourth, the morphological gradient of the image is modified by imposing regional minima at the locations of both the internal and the external markers. Finally, a weighted function combines the top/bottom hat transformation algorithm and the markers algorithm to yield the new algorithm. The experimental results show the superiority of the new algorithm in suppressing over-segmentation.
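The abstract names the stages without giving formulas, so the following is a minimal sketch of the third stage only, assuming the commonly used enhancement rule: image + top hat - bottom hat, where the top hat is the image minus its opening and the bottom hat is its closing minus the image. The flat square structuring element, the border handling (windows clipped to the image), and all function names are assumptions for illustration.

```python
def _window_filter(img, pick, radius=1):
    """Apply pick (min or max) over a square window clipped to the image."""
    h, w = len(img), len(img[0])
    out = []
    for r in range(h):
        row = []
        for c in range(w):
            window = [img[rr][cc]
                      for rr in range(max(0, r - radius), min(h, r + radius + 1))
                      for cc in range(max(0, c - radius), min(w, c + radius + 1))]
            row.append(pick(window))
        out.append(row)
    return out

def erode(img, radius=1):
    return _window_filter(img, min, radius)

def dilate(img, radius=1):
    return _window_filter(img, max, radius)

def opening(img, radius=1):
    # Erosion followed by dilation removes bright details smaller than the window.
    return dilate(erode(img, radius), radius)

def closing(img, radius=1):
    # Dilation followed by erosion fills dark details smaller than the window.
    return erode(dilate(img, radius), radius)

def enhance_contrast(img, radius=1):
    """Return img + top hat - bottom hat, computed pixel by pixel."""
    op, cl = opening(img, radius), closing(img, radius)
    return [[px + (px - o) - (k - px) for px, o, k in zip(row, orow, crow)]
            for row, orow, crow in zip(img, op, cl)]

# Hypothetical 5x5 test image: a single bright pixel on a dark background.
image = [[0] * 5 for _ in range(5)]
image[2][2] = 10
enhanced = enhance_contrast(image)
```

In this toy image the isolated bright pixel is removed by the opening, so its top hat equals its full value and the enhancement doubles it, while the flat background is left unchanged. A real pipeline would normally rely on an image-processing library rather than nested lists.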