Pupil Dilation and Response Slowing Distinguish Deliberate Explorative Choices in the Probabilistic Learning Task

Kozunova, Galina L.; Sayfulina, Ksenia E.; Prokofyev, Andrey O.; Medvedev, Vladimir; Rytikova, A.; Stroganova, Tatiana A.; Chernyshev, Boris V.

doi:10.1101/2021.10.19.464963

Cited by 1 publication

(5 citation statements)

References 72 publications

“…Crucially, the magnitude of a large-scale β suppression, when a subject formed his/her decision reliably, predicted decision costs (with response time taken as a measurable proxy of this internal variable) on a single trial basis. Given multiple evidence associating the strength of β suppression with greater cognitive and attentional efforts 16–18, this finding strongly suggests that such a choice requires additional resources to overcome the internal utility model favoring the advantageous alternative and strongly supports the hypothesis of its deliberately explorative nature 13,33.…”

Section: Discussion; classification: supporting

confidence: 58%

“…This conflict arises between at least two simultaneously active competing internal models, or ‘task sets’ 9,12 – one being a predominant response tendency (exploitation), and the other – its conscious alternative (exploration). Our recent pupillometric study lends support to this assumption 13: we found that such explorative choices compared to exploitative ones are accompanied by larger pupil dilation and longer decision time. We speculated that this state of conflict supposedly entails an increase in the degree of processing required to make the deliberately explorative decisions.…”

Section: Introduction; classification: supporting

confidence: 66%

“…We addressed the influence of the previous feedback because it affects behavioral and pupillometric measures on the next trial 13…”

Section: Discussion; classification: mentioning

confidence: 99%

“…A recent pupillometric study 13 revealed that advantageous choices that immediately preceded and immediately followed exploratory choices significantly differed from the advantageous choices committed within the periods of continuous exploitation. Thus, we used the following four levels of Choice type factor (Fig.…”

Section: Methods; classification: mentioning

confidence: 97%

“…Finally, we aimed to evaluate how the immediate history of punishments and rewards in the previous trial affected the brain response to the feedback on the current trial. We addressed the influence of the previous feedback because it affects behavioral and pupillometric measures on the next trial 13. An additional factor Previous feedback corresponding to the outcome of the previous trial (two levels: punishment and reward) was included into the LMM model: …”

Section: Methods; classification: mentioning

confidence: 99%

Losses resulting from deliberate exploration trigger beta oscillations in frontal cortex

Chernyshev, Pultsina, Tretyakova et al. 2022

Preprint

Self Cite

We examined the neural signature of directed exploration by contrasting MEG beta (16–30 Hz) power changes between disadvantageous and advantageous choices in a two-choice probabilistic reward task. Both types of choices were made when our participants had learned the probabilistic contingency between choices and their outcomes, i.e., had acquired an inner model of choice value. Therefore, rare disadvantageous choices might serve exploratory, environment-probing purposes. The study yielded two main findings. Firstly, decision making leading to disadvantageous choices took more time and showed greater large-scale suppression of beta oscillations than decision making leading to advantageous choices. The additional neural resources required by disadvantageous decisions strongly suggest their deliberately explorative nature. Secondly, the outcomes of disadvantageous and advantageous choices had qualitatively different impacts on feedback-related beta oscillations. Only losses, not gains, resulting from disadvantageous choices were followed by late beta synchronization in frontal cortex. Our results are consistent with the role of frontal beta oscillations in stabilizing neural representations of the selected behavioral rule when an exploratory strategy conflicts with value-based behavior. Punishment for an exploratory choice, being congruent with its low value in the reward history, is more likely to strengthen, through punishment-related beta oscillations, the representation of its competitor, the inner utility model.
