A broadly used computational framework posits that two learning systems operate in parallel during the learning of choice preferences, namely the model-free and model-based reinforcement-learning systems. In this study, we examined another possibility, in which model-free learning is the basic system and model-based information acts as its modulator. Accordingly, we proposed several modified versions of a temporal-difference learning model to explain the choice-learning process. Using the two-stage decision task developed by Daw, Gershman, Seymour, Dayan, and Dolan (2011), we compared their original computational model, which assumes a parallel learning process, with our proposed models, which assume a sequential learning process. Choice data from 23 participants showed a better fit with the proposed models. More specifically, the proposed eligibility adjustment model, which assumes that the environmental model can weight the degree of the eligibility trace, explains choices better under both model-free and model-based control and has a simpler computational algorithm than the original model. In addition, the forgetting learning model and its variant, which assume changes in the values of unchosen actions, substantially improved the fits to the data. Overall, we show that a hybrid computational model best fits the data. The parameters of this model succeed in capturing individual tendencies with respect to both model use in learning and exploration behavior. This computational model provides novel insights into learning with interacting model-free and model-based components.
Keywords: Computational model · Model-free · Model-based · Eligibility trace · Reinforcement learning

One common theoretical framework holds that value-based decision-making is realized by two distinct cognitive or learning systems: one is habitual and inflexible and requires little computation, whereas the other is deliberative and accurate and requires heavy computation (Dickinson, 1985; Kahneman, 2010; Redish, Jensen, & Johnson, 2008). In the field of instrumental learning, these two systems correspond to the model-free and model-based learning systems, respectively (Daw, Niv, & Dayan, 2005; Dolan & Dayan, 2013; Gillan, Otto, Phelps, & Daw, 2015). Prediction based on model-free learning is analogous to Thorndike's law of effect, in which a behavior that is followed by a pleasant outcome is likely to be repeated, whereas a behavior that is followed by an unpleasant outcome is likely to be inhibited (Thorndike, 1911). In contrast, the model-based learning system uses the agent's internal model, or cognitive map (Tolman, 1948), of the structure of the environment to dynamically adjust behavior by propagating information to all states and actions, including those that have not previously been experienced. However, it has yet to be determined how humans and animals form preferences on the basis of these learning systems and how the interaction between these systems is implemented.
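To make the contrast between the two systems concrete, the following is a minimal sketch of the parallel hybrid account of Daw et al. (2011) for a toy two-stage task: model-free first-stage values are updated by SARSA(λ)-style prediction errors carried back through an eligibility trace, while model-based values are computed by planning over the transition structure, and the two are mixed with a weight w. The parameter names (alpha, lam, w) and the transition matrix are illustrative assumptions, not the paper's actual fitted values or code.

```python
import numpy as np

# Illustrative parameters (not fitted values from the paper):
alpha, lam, w = 0.3, 0.9, 0.5   # learning rate, eligibility trace, model-based weight

# Assumed toy transition structure: each of 2 first-stage actions leads to
# one of 2 second-stage states with common (0.7) vs. rare (0.3) transitions.
T = np.array([[0.7, 0.3],
              [0.3, 0.7]])      # T[a1, s2] = P(second-stage state | first-stage action)

q_mf1 = np.zeros(2)             # model-free first-stage action values
q2 = np.zeros((2, 2))           # second-stage state-action values

def update(a1, s2, a2, reward):
    """Model-free learning update after one completed trial."""
    delta1 = q2[s2, a2] - q_mf1[a1]     # first-stage prediction error
    delta2 = reward - q2[s2, a2]        # second-stage prediction error
    # The eligibility trace lam carries the second-stage error back to stage 1.
    q_mf1[a1] += alpha * delta1 + alpha * lam * delta2
    q2[s2, a2] += alpha * delta2

def q_hybrid():
    """Mix model-based (planned) and model-free first-stage values."""
    q_mb1 = T @ q2.max(axis=1)          # model-based: expected best stage-2 value
    return w * q_mb1 + (1 - w) * q_mf1
```

For example, after a single rewarded trial through action 0 and state 0, the eligibility trace raises q_mf1[0] even though the bootstrapped first-stage error delta1 is still zero; the model-based term then generalizes that reward to both first-stage actions in proportion to the transition probabilities.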