2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
DOI: 10.1109/iros40897.2019.8968010
Trajectory Optimization for Unknown Constrained Systems using Reinforcement Learning

Abstract: In this paper, we propose a reinforcement learning-based algorithm for trajectory optimization for constrained dynamical systems. This problem is motivated by the fact that for most robotic systems, the dynamics may not always be known. Generating smooth, dynamically feasible trajectories could be difficult for such systems. Using sampling-based algorithms for motion planning may result in trajectories that are prone to undesirable control jumps. However, they can usually provide a good reference trajectory whi…

Cited by 25 publications (20 citation statements)
References 14 publications
“…Ota et al. proposed training a 6-DoF manipulator arm with a good reference trajectory so that it quickly tracks a designed trajectory in configuration space. 122 In a recent study, an improved Q-learning algorithm was exploited to form a reward-and-penalty mechanism, 123 which effectively tackles the problem of robotic time-optimal route tracking using prior knowledge. The prior knowledge of leg trajectories was embedded into the action space during safe exploration, requiring only limited data collection, to achieve walking on a quadruped robot.…”
Section: Trajectory and Route Tracking
Mentioning confidence: 99%
“…Mode-covering can be achieved via forward KL-divergence, as in behavior cloning. Hence, we incorporate a Behavior Cloning Loss [21,34] into our policy updates to encourage active exploration in the early stages of training. We also add a Q-Filter to avoid choosing sub-optimal actions.…”
Section: Modification For Efficient Training
Mentioning confidence: 99%
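The Q-filtered behavior-cloning term quoted above can be sketched roughly as follows; the pure-Python batch representation and the filter rule (apply the BC penalty only on transitions where the critic scores the demonstrated action above the policy's own action) are illustrative assumptions, not the cited paper's exact implementation.

```python
def q_filtered_bc_loss(q_demo, q_pi, pi_actions, demo_actions):
    """Behavior-cloning loss gated by a Q-filter.

    q_demo       : per-transition critic values Q(s, a_demo)
    q_pi         : per-transition critic values Q(s, pi(s))
    pi_actions   : actions proposed by the policy, one vector per transition
    demo_actions : demonstrated actions, one vector per transition
    """
    # Q-filter: keep the BC term only where the demonstration
    # out-scores the policy's action under the critic.
    kept = [
        sum((p - d) ** 2 for p, d in zip(pa, da))
        for qd, qp, pa, da in zip(q_demo, q_pi, pi_actions, demo_actions)
        if qd > qp
    ]
    # Mean squared action error over the filtered transitions
    # (zero when nothing passes the filter).
    return sum(kept) / len(kept) if kept else 0.0
```

When no transition passes the filter the term vanishes, so the ordinary policy-update objective dominates and the demonstrations cannot pull the policy toward sub-optimal actions.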
“…The combination of reference paths and RL has been widely researched [9,10,11,12]. In [9,11], Probabilistic Roadmaps (PRM) [2] are used to find reference paths, and RL is used for point-to-point navigation as a local planner for PRM.…”
Section: Related Work
Mentioning confidence: 99%
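To make the PRM-plus-RL pipeline in the quoted passage concrete, here is a minimal PRM sketch that produces a reference path an RL local planner could then track point-to-point. The unit-square workspace, circular obstacles, sampling parameters, and straight-line collision check are all illustrative assumptions, not the setup of [9] or [11].

```python
import heapq
import math
import random

def prm_path(start, goal, obstacles, n_samples=300, radius=0.35, seed=0):
    """Toy probabilistic roadmap in the unit square.

    obstacles : list of (center, r) circular obstacles.
    Returns a start-to-goal waypoint list, or None if unconnected.
    """
    rng = random.Random(seed)

    def free(p):
        return all(math.dist(p, c) > r for c, r in obstacles)

    def segment_free(a, b, steps=20):
        # Check evenly spaced points along the straight segment a-b.
        return all(
            free((a[0] + (b[0] - a[0]) * t / steps,
                  a[1] + (b[1] - a[1]) * t / steps))
            for t in range(steps + 1)
        )

    # Node 0 is the start, node 1 the goal; the rest are free samples.
    nodes = [start, goal] + [
        p for p in ((rng.random(), rng.random()) for _ in range(n_samples))
        if free(p)
    ]
    # Connect every collision-free pair within the connection radius.
    edges = {i: [] for i in range(len(nodes))}
    for i in range(len(nodes)):
        for j in range(i + 1, len(nodes)):
            d = math.dist(nodes[i], nodes[j])
            if d <= radius and segment_free(nodes[i], nodes[j]):
                edges[i].append((j, d))
                edges[j].append((i, d))
    # Dijkstra over the roadmap from start (0) to goal (1).
    dist, prev, pq = {0: 0.0}, {}, [(0.0, 0)]
    while pq:
        d, u = heapq.heappop(pq)
        if u == 1:
            break
        if d > dist[u]:
            continue
        for v, w in edges[u]:
            if d + w < dist.get(v, math.inf):
                dist[v], prev[v] = d + w, u
                heapq.heappush(pq, (d + w, v))
    if 1 not in prev:
        return None
    path, u = [], 1
    while u != 0:
        path.append(u)
        u = prev[u]
    path.append(0)
    return [nodes[i] for i in reversed(path)]
```

In the PRM-plus-RL setting, each consecutive waypoint pair in the returned path would become a point-to-point navigation subgoal for the learned local planner.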
“…In our proposed work, this is achieved because the path encodes the goal and automatically tries to avoid obstacles in the environment as well. The closest works to ours are [12,18]. [12] learns an RL agent that optimizes trajectories for a 6-DoF manipulator arm.…”
Section: Related Work
Mentioning confidence: 99%