IEE Colloquium on Self Learning Robots 1996
DOI: 10.1049/ic:19960148

Neural reinforcement learning for an obstacle avoidance behavior

Abstract: Reinforcement learning (RL) offers a range of algorithms for in-situ behavior synthesis [1]. The Q-learning technique [2] is certainly the most widely used of the RL methods. Multilayer perceptron implementations of Q-learning were proposed early on [3], motivated by their modest memory requirements and their generalization capability [4]. Self-organizing map implementations of Q-learning are more recent [5]. We propose to study the use of this implementation and to discuss its merits compared to …
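Since the abstract is truncated here, a brief orientation may help: the Q-learning rule it refers to [2] is the one-step tabular update sketched below. This is a minimal illustrative sketch, not code from the paper; the Gym-style environment interface (env.reset, env.step, env.action_space_n) is a hypothetical placeholder.

    # Minimal tabular Q-learning sketch (Watkins' one-step rule); illustrative only.
    # `env` is a hypothetical Gym-style environment with discrete states and actions.
    import random
    from collections import defaultdict

    def q_learning(env, episodes=500, alpha=0.1, gamma=0.95, epsilon=0.1):
        Q = defaultdict(float)                     # Q[(state, action)] -> estimated return
        actions = list(range(env.action_space_n))  # assumed discrete action set
        for _ in range(episodes):
            s, done = env.reset(), False
            while not done:
                # epsilon-greedy exploration over the current Q estimates
                if random.random() < epsilon:
                    a = random.choice(actions)
                else:
                    a = max(actions, key=lambda act: Q[(s, act)])
                s2, r, done = env.step(a)
                # one-step backup toward r + gamma * max_a' Q(s', a')
                best_next = 0.0 if done else max(Q[(s2, a2)] for a2 in actions)
                Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
                s = s2
        return Q

The table Q grows with the number of state-action pairs, which is exactly the memory cost that the MLP and self-organizing map implementations cited in the abstract are meant to avoid.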

Cited by 2 publications (3 citation statements); references 11 publications.
“…However, when dealing with a real robot the state space is typically too large to explore. A solution to this problem is to introduce decision trees or neural networks [10,34,35] to approximate the value function (state-action function) in order to generalize from a certain number of situations.…”
Section: Simulation Results (mentioning)
confidence: 99%
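The generalization described in this statement replaces each table entry with a parameterized approximator Q(s, a; w). A standard semi-gradient form of the resulting update, stated here for orientation and not quoted from the cited works, is:

    \[
      w \leftarrow w + \alpha \Big( r + \gamma \max_{a'} Q(s', a'; w) - Q(s, a; w) \Big) \nabla_w Q(s, a; w)
    \]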
“…When the situation space becomes so large combined with all possible actions, an exhaustive exploration or memorization of all situation-action pairs is impossible. One solution to this problem is the generalization process through the use of artificial neural networks, as suggested in Refs [29,34,35], but problems with high-dimensional movement systems remain daunting [39]. A possible way to reduce the computational complexity of learning a control policy comes from modularizing the policy.…”
Section: Conclusion and Discussion (mentioning)
confidence: 99%
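The policy modularization mentioned at the end of this statement can be pictured as splitting control into behavior modules with a simple arbiter on top. Below is a hypothetical sketch; the module names, the state layout, and the proximity threshold are illustrative, not from the cited paper.

    # Hypothetical modular-policy sketch: each behavior module is a separately
    # learned policy, and a fixed-priority arbiter decides which one acts.
    def near_obstacle(sensor_readings, threshold=0.3):
        # assumed: normalized proximity readings, smaller means closer
        return min(sensor_readings) < threshold

    def modular_action(state, avoid_policy, seek_policy):
        # obstacle avoidance preempts goal seeking; each sub-policy is learned
        # over its own, much smaller, state-action space
        if near_obstacle(state["sensors"]):
            return avoid_policy(state)
        return seek_policy(state)

Each sub-policy then only has to cover the situations relevant to its own behavior, which is the complexity reduction the statement alludes to.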
“…Sigmoidal multilayer perceptron (MLP) neural networks are used to approximate the Q-function. In accordance with the QCON model [22], [30] …”
Section: Q-learning (mentioning)
confidence: 99%
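For context on the QCON arrangement mentioned in this statement: it uses one sigmoidal MLP per discrete action, each mapping the state vector to a scalar Q(s, a). The sketch below is an illustrative reconstruction under that assumption; the layer sizes, learning rate, and single gradient step are not taken from the cited papers.

    # Illustrative QCON-style setup: one tiny sigmoidal MLP per action.
    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    class QNet:
        """One-hidden-layer sigmoidal MLP with a linear output unit: s -> Q(s, a)."""
        def __init__(self, n_in, n_hidden=8, seed=0):
            rng = np.random.default_rng(seed)
            self.W1 = rng.normal(0.0, 0.1, (n_hidden, n_in))
            self.W2 = rng.normal(0.0, 0.1, n_hidden)

        def forward(self, s):
            self.h = sigmoid(self.W1 @ s)   # hidden activations, cached for the backward pass
            return float(self.W2 @ self.h)  # scalar Q(s, a)

        def td_step(self, s, target, lr=0.05):
            # one gradient step pulling Q(s, a) toward the TD target
            err = target - self.forward(s)
            dh = self.W2 * self.h * (1.0 - self.h)  # backprop through the sigmoid layer
            self.W1 += lr * err * np.outer(dh, s)
            self.W2 += lr * err * self.h

    # one network per discrete action, as in the QCON model
    nets = [QNet(n_in=4, seed=i) for i in range(3)]
    q_values = [net.forward(np.zeros(4)) for net in nets]

Acting greedily then amounts to evaluating all of the per-action networks on the current state and taking the argmax.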