Simple Artificial Neural Networks That Match Probability and Exploit and Explore When Confronting a Multiarmed Bandit

Dawson, Michael R. W.; Dupuis, Brian; Kelly, Debbie M.

doi:10.1109/tnn.2009.2025588

Cited by 19 publications

(32 citation statements)

References 36 publications


“…It is interesting to note that, ignoring features, the "correct" wall length configuration is reinforced on 50 % of its presentations (the reinforced location and its nonreinforced rotational equivalent), while the "incorrect" configurations present in any condition are reinforced 0 % of the time, and the perceptrons' responses converge to match these probabilities. The operant perceptron has already been established to match probabilities in classical choice-behavior tasks (Dawson et al., 2009); for it to exhibit this behavior in a reorientation context reinforces Miller and Shettleworth's (2007) conceptualization of reorientation as an operant task.…”

Section: Connection Weights (mentioning)

confidence: 83%

“…While we could extend the operant perceptron to see whether it fits the data from some of the novel tasks (i.e., regular octagons; Newcombe et al., 2010) in a manner similar to a more standard perceptron (Dawson, Kelly, et al., 2010), the synthetic approach would be to generate totally new predictions inspired by our findings. In this case, we might try nonuniform octagons (an arena type not yet investigated), or we might note other successes of the operant perceptron altogether, such as superconditioning (Dupuis & Dawson, in press) or probability matching (Dawson et al., 2009), and branch out beyond reorientation into completely new paradigms. Lewandowsky (1993) observed that computer modeling had its benefits, if done with care.…”

Section: Discussion (mentioning)

confidence: 99%

“…In effect, the perceptron will choose whether or not to visit a location with a probability based on how attractive the cues at that location are, and it will learn only from locations it chooses to visit. This algorithm is detailed at length in Dupuis and Dawson (in press) and in a more abbreviated form in Dawson et al. (2009).…”

Section: Training Methods (mentioning)

confidence: 99%

“…For locations that are reinforced, the perceptron is trained to turn on (output activity = 1); for locations that are not reinforced, the perceptron is trained to turn off (output activity = 0). Because, during learning, perceptron activity falls in the continuous range between 0 and 1, at any moment in time, the perceptron's output can be interpreted as its estimation of the conditional probability of reinforcement at a location, given the cues at that location (Dawson et al., 2009; Dupuis & Dawson, in press).…”

Section: Defining the Task: Response (mentioning)

confidence: 99%
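
The two statements above describe the operant training rule only in words: the perceptron visits a location with a probability based on its response to the cues there, learns only from visits, and is trained toward 1 at reinforced locations and 0 at nonreinforced ones. The following Python sketch is one way such a rule might look. It is not the published algorithm, which is given in Dupuis and Dawson (in press) and Dawson et al. (2009). The logistic activation, the delta-rule update, the learning rate, the one-hot cue vectors, and the 0.75 and 0.25 reinforcement probabilities are all assumptions chosen for illustration, as are the function names.

```python
import math
import random


def respond(weights, cues):
    """Logistic output in (0, 1), read here as an estimated probability
    of reinforcement given the cues (activation function is an assumption)."""
    net = sum(w * c for w, c in zip(weights, cues))
    return 1.0 / (1.0 + math.exp(-net))


def train_operant_perceptron(locations, trials=100_000, rate=0.01, seed=0):
    """Train on `locations`, a list of (cues, reinforcement_probability) pairs.

    Hypothetical sketch: one output unit, no bias, delta-rule updates.
    """
    rng = random.Random(seed)
    weights = [0.0] * len(locations[0][0])
    for _ in range(trials):
        cues, p_reinforced = rng.choice(locations)
        output = respond(weights, cues)
        # Operant step: visit the location with probability equal to the output.
        if rng.random() >= output:
            continue  # not visited, so nothing is learned on this trial
        # Target is 1 if the visit is reinforced, 0 otherwise.
        target = 1.0 if rng.random() < p_reinforced else 0.0
        error = target - output
        # Weights change only after a visit (assumed delta rule).
        weights = [w + rate * error * c for w, c in zip(weights, cues)]
    return weights


if __name__ == "__main__":
    # Two hypothetical locations, each signalled by its own cue.
    locations = [((1.0, 0.0), 0.75), ((0.0, 1.0), 0.25)]
    weights = train_operant_perceptron(locations)
    for cues, p in locations:
        print(cues, "reinforced with p =", p, "response =", respond(weights, cues))
```

Under these assumptions the expected weight change at a location is zero when the output equals that location's reinforcement probability, so the responses drift toward those probabilities. That is the sense in which a perceptron of this kind can be said to match probabilities.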

“…A perceptron trained with a standard learning algorithm has been shown to generate most of the interesting regularities found in reorientation task behavior (Dawson, Kelly, et al., 2010). An operant perceptron model of reorientation, which uses a more psychologically plausible learning algorithm (in which the perceptron has a chance of not visiting a location, based on the total associative strength at that location, and connection weights are adjusted only when the perceptron chooses to investigate a location), has also been proposed and has shown some promising results (Dawson, Dupuis, Spetch, & Kelly, 2009; Dupuis & Dawson, in press). The purpose of the present article is to illustrate how an operant perceptron can be used to explore reorientation by observing the model's behavior when novel reorientation paradigms are simulated.…”

Section: Introduction (mentioning)

confidence: 99%


Get out of the corner: Inhibition and the effect of location type and number on perceptron and human reorientation

Dupuis; Dawson

2013, Learn Behav


Spatial learning and navigation have frequently been investigated using a reorientation task paradigm (Cheng, Cognition, 23(2), 149-78, 1986). However, implementing this task typically involves making tacit assumptions about the nature of spatial information. This has important theoretical consequences: Theories of reorientation typically focus on angles at corners as geometric cues and ignore information present at noncorner locations. We present a neural network model of reorientation that challenges these assumptions and use this model to generate predictions in a novel variant of the reorientation task. We test these predictions against human behavior in a virtual environment. Networks and humans alike exhibit reorientation behavior even when goal locations are not present at corners. Our simulated and our experimental results suggest that angles are processed in a manner more similar to features, acting as a focal point for reorientation, and that the mechanisms governing reorientation behavior may be inhibitory rather than excitatory.
