Social learning is a robust phenomenon whereby individuals learn from others by means of observation, imitation, or compliance with others’ instructions or advice. However, the computational mechanism underlying advice-taking is still largely unknown. The current study adds to the social-learning literature by disentangling the effects of two types of advice-taking on choice behavior: informed advice-taking (learning the value of following advice), and non-informed advice-taking (a fixed and constant bias to follow advice regardless of one’s own experience). To accomplish this aim, 153 participants completed a reinforcement learning task across two sessions, during which they were asked to make choices to gain rewards. Participants were either presented with advice from an artificial teacher (60% of the trials) or not (40%), allowing us to sample choice behavior both with and without the presence of advice. First, we found a strong and reliable tendency to follow advice above and beyond the effect of individual value learning. Computational modeling was then used to explore the contribution of different learning strategies to predicting choice behavior. Modeling results suggested that the observed behavior was influenced by three distinct processes: (a) individual learning (i.e., learning the value of specific actions, independent of advice), (b) informed advice-taking (i.e., learning the value of following advice), and (c) non-informed advice-taking (i.e., a fixed and constant bias to follow advice regardless of choice and outcome history). We additionally provide distinct regression signatures in simulated and empirical data that are unique to informed and non-informed advice-taking. Finally, the tendency to follow advice was consistent at the level of a single individual, with good test-retest reliability between sessions, suggesting that conformity bias is akin to a trait. We discuss the theoretical implications of integrating internal and external information during the learning process.
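As a rough illustration of how the three processes named in this abstract could combine in a single choice rule, the sketch below adds a learned "value of following advice" and a fixed bias to the advised action inside a softmax. The abstract does not give the authors' model equations, so the function names, the parameters (`beta`, `alpha`, `bias`, `q_follow`) and the additive form are all assumptions made for illustration only.

```python
import math

def choice_probabilities(q_values, advised, q_follow, bias, beta=1.0):
    """Softmax over actions (hypothetical form, not the authors' model).

    q_values : learned value of each action (individual learning)
    advised  : index of the advised action, or None on no-advice trials
    q_follow : learned value of following advice (informed advice-taking)
    bias     : fixed tendency to follow advice (non-informed advice-taking)
    """
    logits = []
    for action, q in enumerate(q_values):
        bonus = (q_follow + bias) if action == advised else 0.0
        logits.append(beta * q + bonus)
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(l - peak) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def update(q_values, q_follow, chosen, advised, reward, alpha=0.3):
    """Delta-rule updates; the fixed bias is never updated by outcomes."""
    q_values = list(q_values)
    q_values[chosen] += alpha * (reward - q_values[chosen])
    # Assumption: the value of following advice is updated only on trials
    # where advice was shown and the participant followed it.
    if advised is not None and chosen == advised:
        q_follow += alpha * (reward - q_follow)
    return q_values, q_follow
```

In this sketch, a participant with `bias > 0` prefers the advised action even before any outcome has been observed, whereas `q_follow` moves that preference up or down only as followed advice is rewarded or not.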
Current studies suggest that individuals estimate the value of their choices based on observed feedback. Here, we ask whether individuals update the value of their unchosen actions, even when the associated feedback remains unknown. Two hundred and three individuals completed a multi-armed bandit task, making choices to gain rewards. We found robust evidence suggesting inverse value updating for unchosen actions based on the chosen action’s outcome. Computational modeling results suggested that this effect is mainly explained by a value updating mechanism whereby individuals integrate the outcome history for choosing an option with that of avoiding the alternative. Properties of the deliberation (i.e., duration/difficulty) did not moderate the latent value updating of unchosen actions, suggesting that memory traces generated during deliberation play a smaller role in this phenomenon than previously thought. We discuss the mechanisms facilitating credit assignment to unchosen actions and their implications for human decision-making.
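One simple way to picture inverse value updating is a delta rule in which the chosen option moves toward the observed outcome while the offered-but-unchosen option moves toward the opposite outcome. The sketch below assumes rewards coded as 0 or 1 and a separate, smaller learning rate for the unchosen option; both choices, and all names in the code, are illustrative assumptions rather than the model reported in the study.

```python
def update_values(values, chosen, unchosen, reward,
                  alpha_chosen=0.3, alpha_unchosen=0.1):
    """Return updated values after one trial (hypothetical sketch).

    values   : list of current value estimates, one per option
    chosen   : index of the option that was selected
    unchosen : index of the option that was offered but not selected
    reward   : observed outcome for the chosen option, coded 0 or 1
    """
    values = list(values)
    # Standard update: move the chosen option toward the observed outcome.
    values[chosen] += alpha_chosen * (reward - values[chosen])
    # Inverse update: move the unchosen option toward the opposite outcome,
    # although its own feedback was never observed.
    values[unchosen] += alpha_unchosen * ((1 - reward) - values[unchosen])
    return values
```

Options that were not offered on the trial are left unchanged in this sketch.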
To establish accurate action-outcome associations in the environment, individuals must refrain from assigning value to outcome-irrelevant features. However, reinforcement learning studies have largely ignored the role of attentional control processes in credit assignment (the process of assigning value to one’s actions). In the current study, we examined the extent to which working memory – a system that can filter and block the processing of irrelevant information in one’s mind – predicted credit assignment to outcome-irrelevant task features. One hundred and seventy-four individuals completed tasks estimating working memory capacity and outcome-irrelevant learning. Outcome-irrelevant learning was estimated in a reinforcement learning task where only the stimuli’s visual features predicted reward, but not the response keys used to indicate one’s selection. As expected, we found a consistent tendency to assign value to the task’s response keys, reflecting outcome-irrelevant learning at the group level. However, we also found substantial individual differences, such that only 55% of participants demonstrated this effect. Importantly, working memory capacity significantly moderated individual differences in outcome-irrelevant learning; individuals with higher capacity were less likely to assign credit to the outcome-irrelevant feature (i.e., response key). We discuss the influence of working memory on outcome-irrelevant learning through the perspective of cognitive control failure.
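Outcome-irrelevant learning of this kind can be pictured as a learner that keeps two value tables, one for stimuli and one for response keys, and credits both after every outcome even though only the stimulus predicts reward. The sketch below is a hypothetical illustration: the `key_weight` parameter, the dictionary layout and the function names are assumptions, with `key_weight = 0` standing in for a learner that fully filters out the response key.

```python
def update_credit(stim_values, key_values, stimulus, key, reward, alpha=0.3):
    """Delta-rule update of both the relevant and the irrelevant feature."""
    stim_values = dict(stim_values)
    key_values = dict(key_values)
    stim_values[stimulus] += alpha * (reward - stim_values[stimulus])
    # The response key does not predict reward in the task, but it still
    # receives credit here; this is the outcome-irrelevant learning.
    key_values[key] += alpha * (reward - key_values[key])
    return stim_values, key_values

def net_value(stim_values, key_values, stimulus, key, key_weight):
    """Decision value of choosing `stimulus` by pressing `key`.

    key_weight scales how much the irrelevant key value leaks into choice.
    """
    return stim_values[stimulus] + key_weight * key_values[key]
```

With `key_weight` above zero, a key that happened to be pressed on rewarded trials raises the decision value of whatever stimulus is mapped to it next, which is the kind of behavioral signature the abstract describes.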