2006
DOI: 10.1007/11552246_35

Autonomous Inverted Helicopter Flight via Reinforcement Learning

Cited by 427 publications (327 citation statements)
References 5 publications
“…Several different definitions of the MDP exist. In the remainder of this thesis, we will use the following one from [44]. An MDP can be represented as a tuple $\langle S, A, D, P(s_t, a_t), \gamma, \rho(s_t, a_t) \rangle$, consisting of:…”
Section: The Markov Decision Process (mentioning)
confidence: 99%
“…After the off-line training, they expose the controller on-line to more demanding flight maneuvers. Ng uses an actor-only method to fly an autonomous model helicopter, by estimating the cost value from Monte-Carlo roll-outs [44] (see subsection 2.5.3).…”
Section: Temporal-difference Learning: Actors Critics Actor-critics (mentioning)
confidence: 99%
“…Since the recorded data maps directly to the learner platform, this demonstration technique best minimizes the introduction of correspondence issues into an LfD system. Examples of successful teleoperated LfD systems include both real [10] and simulated [7] robot applications.…”
Section: Learning From Demonstration (mentioning)
confidence: 99%
“…RL has been used to control a model helicopter that can hover while inverted in air [7]. Other ML techniques have been applied to directly control simulated bipedal robots: in [8] a central pattern generator was used for rhythm generation in the hips and knees of a simulated bipedal robot, and a dynamics controller was used to control the ankles of the robot.…”
Section: Related Work (mentioning)
confidence: 99%