Robotics: Science and Systems VII 2011
DOI: 10.15607/rss.2011.vii.008
Learning to Control a Low-Cost Manipulator using Data-Efficient Reinforcement Learning

Abstract: Over the last few years, there has been substantial progress in robust manipulation in unstructured environments. The long-term goal of our work is to move away from precise, but very expensive, robotic systems and to develop affordable, potentially imprecise, self-adaptive manipulator systems that can interactively perform tasks such as playing with children. In this paper, we demonstrate how a low-cost off-the-shelf robotic system can learn closed-loop policies for a stacking task in only a handful of trials […]

Cited by 199 publications (203 citation statements)
References 18 publications
“…Continuous RL has been used for reaching and grasping objects [30–32], as well as for the transportation of grasped objects [12, 32–35]. These methods are driven by feedback from tactile sensing [12, 35], Cartesian- and joint-space coordinates [33, 34], or both [32]. However, these approaches have not been applied to in-hand manipulation, i.e., changing the object's pose with respect to the hand.…”
Section: Reinforcement Learning for Manipulation (mentioning)
confidence: 99%
“…However, often many trials are necessary to obtain a good controller. Model-based RL methods potentially use the data more efficiently, but it is well known that model bias can strongly degrade the learning performance [4]. Here, a controller might succeed in simulation but fail when applied to the real system if the model does not describe the complete system dynamics.…”
Section: Approaches in Reinforcement Learning (mentioning)
confidence: 99%
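
The quoted passage contrasts data-hungry model-free RL with model-based RL and names model bias as the main failure mode: a policy optimized against a learned model can exploit the model's errors and then fail on the real system. The sketch below is an illustrative assumption only (a toy 1-D system, a deliberately misspecified linear model, and random-search policy improvement); it is not the cited paper's PILCO method, and every name in it is made up for the sketch. It only shows the loop structure in which this kind of bias arises.

```python
# Minimal model-based RL loop sketch (toy example, NOT the paper's method):
# collect a few real trials, fit a dynamics model, improve the policy using
# only the model, and repeat. A biased model can mislead the policy search.
import numpy as np

rng = np.random.default_rng(0)

def real_step(s, a):
    # "Real" system (unknown to the learner); the sin term is a nonlinearity
    # that the linear model below cannot capture -> source of model bias.
    return 0.9 * s + 0.5 * a + 0.1 * np.sin(3.0 * s)

def rollout(step_fn, policy, horizon=30, s0=1.0):
    s, data, cost = s0, [], 0.0
    for _ in range(horizon):
        a = policy(s)
        s_next = step_fn(s, a)
        data.append((s, a, s_next))
        cost += s_next ** 2          # objective: drive the state to zero
        s = s_next
    return data, cost

def fit_linear_model(data):
    # Least-squares fit of s' ~ [s, a]: the learned, possibly biased, model.
    X = np.array([[s, a] for s, a, _ in data])
    y = np.array([sn for _, _, sn in data])
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return lambda s, a: w[0] * s + w[1] * a

def optimize_policy(model_step, n_candidates=200):
    # Crude policy search: pick the linear feedback gain that looks best
    # under the learned model. If the model is biased, the chosen gain can
    # exploit model errors and perform worse on the real system.
    best_k, best_cost = 0.0, np.inf
    for k in rng.uniform(-3.0, 3.0, n_candidates):
        _, cost = rollout(model_step, lambda s, k=k: -k * s)
        if cost < best_cost:
            best_k, best_cost = k, cost
    return lambda s: -best_k * s

policy = lambda s: rng.normal(scale=0.3)   # initial exploratory policy
dataset = []
for trial in range(5):                     # a handful of trials on the real system
    data, real_cost = rollout(real_step, policy)
    dataset += data
    model_step = fit_linear_model(dataset)
    policy = optimize_policy(model_step)
    print(f"trial {trial}: cost on real system = {real_cost:.2f}")
```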
“…Instead, we include the goal position g as input to the controller, i.e., u = π_θ(s, g), as described in [11]. In this way, a single controller is learned jointly for all goal positions.…”
Section: Multiple Start and Goal States (mentioning)
confidence: 99%
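
A quick sketch of the idea in this snippet: instead of training one controller per target, the goal g is appended to the policy input so that a single parameterized controller u = π_θ(s, g) covers all goal positions. The RBF-network form, dimensions, and names below are assumptions made for the sketch, not the exact controller parameterization of the cited work.

```python
import numpy as np

class GoalConditionedPolicy:
    """u = pi_theta(s, g): one controller shared across all goal positions.

    Illustrative RBF-network policy; feature choice and sizes are
    assumptions for this sketch only.
    """

    def __init__(self, state_dim, goal_dim, action_dim, n_features=50, seed=0):
        rng = np.random.default_rng(seed)
        in_dim = state_dim + goal_dim            # the goal is part of the input
        self.centers = rng.normal(size=(n_features, in_dim))
        self.weights = rng.normal(scale=0.1, size=(n_features, action_dim))

    def __call__(self, s, g):
        x = np.concatenate([s, g])               # policy sees state AND goal
        phi = np.exp(-0.5 * np.sum((self.centers - x) ** 2, axis=1))
        return phi @ self.weights                # control u for this (s, g)

# One policy instance serves every goal position:
policy = GoalConditionedPolicy(state_dim=4, goal_dim=2, action_dim=2)
s = np.zeros(4)
for g in (np.array([0.3, 0.1]), np.array([-0.2, 0.4])):
    print(g, policy(s, g))
```

Because the parameters θ are shared across goals, experience gathered while reaching one goal also improves the controller's behaviour for the others, which is the point the quoted passage makes.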
“…Reducing overall cost is critical to the commercialization of robotic manipulator technologies [1], [2], especially those envisioned for use in unstructured environments such as households, office spaces, and hazardous environments [3], [4]. In order for humanoid robots such as the PR2 [2], NAO [5], and others to reach their target market in households and workplaces, the cost of the robot must decrease.…”
Section: Introduction (mentioning)
confidence: 99%