2018
DOI: 10.1371/journal.pcbi.1006621

Deterministic response strategies in a trial-and-error learning task

Abstract: Trial-and-error learning is a universal strategy for establishing which actions are beneficial or harmful in new environments. However, learning stimulus-response associations solely via trial-and-error is often suboptimal, as in many settings dependencies among stimuli and responses can be exploited to increase learning efficiency. Previous studies have shown that in settings featuring such dependencies, humans typically engage high-level cognitive processes and employ advanced learning strategies to improve …

citations
Cited by 14 publications
(15 citation statements)
references
References 48 publications
(68 reference statements)
“…We also speculate that some subjects may covertly name or label stimuli early in the block, especially in the higher set sizes, and associate those labels with their guesses; strategies like this could appear as early as the first iteration because subjects are informed about the upcoming set size before each block begins and shown a preview of the full set of images. In some situations, subjects could even perform deterministic hypothesis-testing strategies in the early phases of a learning block (e.g., trying each finger from left to right; Mohr et al, 2018). Despite these caveats and alternative learning strategies, our model was still able to closely approximate the time course of subjects' reaction times and choices (Figs.…”
Section: Limitations
mentioning
confidence: 95%
“…gambler's fallacy, in which losing in the past is thought to predict a better chance of winning in the future), or intricate inter-trial patterns (e.g. switch after 2 wins or 4 losses) can be more difficult to identify 59 . Unfortunately, when behavioral response patterns are analyzed within a limited scope along a continuum of being either MB or MF, non-RL strategies are necessarily pressed into the singular axis of MF/MB.…”
Section: MF Behaviour Can Look MB and Vice Versa - Despite the Ubiquity of MB Control
mentioning
confidence: 99%
“…In summary, there are numerous axes along which learning and decision making vary, identified through various traditions of research (e.g., psychology, AI and neuroscience). Future research should carry on identifying these axes, and recent work has made much progress identifying many additional dimensions of learning that capture other important sources of variance in how we learn, such as meta-learning mechanisms 137,138 , learning to use attention 73,139,140 , strategic learning 59 , and uncertainty-dependent parameter changes 62,141,142 . This is evidence that learning and decision making vary along numerous dimensions that cannot be reduced to a simple two-dimensional principal component space, whether that axis is labelled as MB/MF, hot/cold, goal-directed vs. habitual, or otherwise.…”
Section: Paths Forward
mentioning
confidence: 99%
“…Episodic binding would create compounds between obstacles on the road ahead and pushing the left lever (i.e., the intended but not executed correct response) as well as between pushing the right lever (i.e., the actually executed erroneous action) and the moving windshield wipers. This ensures that re-encountering situations would retrieve intended correct responses although actions are represented with their effects on the environment, regardless of whether the action was appropriate or inappropriate, mirroring the adaptive properties of higher-level processes during error-based learning (Mohr et al, 2018).…”
mentioning
confidence: 99%