When humans are offered the choice between rewards available at different points in time, the relative values of the options are discounted according to their expected delays until delivery. Using functional magnetic resonance imaging, we examined the neural correlates of time discounting while subjects made a series of choices between monetary reward options that varied by delay to delivery. We demonstrate that two separate systems are involved in such decisions. Parts of the limbic system associated with the midbrain dopamine system, including paralimbic cortex, are preferentially activated by decisions involving immediately available rewards. In contrast, regions of the lateral prefrontal cortex and posterior parietal cortex are engaged uniformly by intertemporal choices irrespective of delay. Furthermore, the relative engagement of the two systems is directly associated with subjects' choices, with greater relative fronto-parietal activity when subjects choose longer term options.
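As a rough illustration of discounting by delay, the sketch below uses a quasi-hyperbolic (beta-delta) value function. The abstract above does not specify a model, so this choice is an assumption, and the function names and the values of `beta` and `delta` are hypothetical rather than estimates from the study.

```python
def discounted_value(amount, delay_weeks, beta=0.7, delta=0.98):
    """Quasi-hyperbolic (beta-delta) present value of a reward.

    beta and delta are illustrative values, not estimates from the study.
    An immediate reward (delay 0) is not discounted; any delayed reward is
    scaled once by beta and by delta for each week of delay.
    """
    if delay_weeks < 0:
        raise ValueError("delay must be non-negative")
    if delay_weeks == 0:
        return float(amount)
    return beta * (delta ** delay_weeks) * amount


def choose(option_a, option_b, **params):
    """Return the (amount, delay_weeks) option with the higher discounted value."""
    value_a = discounted_value(*option_a, **params)
    value_b = discounted_value(*option_b, **params)
    return option_a if value_a >= value_b else option_b
```

Under these assumed parameters, `choose((20, 0), (25, 2))` picks the immediate option, while `choose((20, 4), (25, 6))` picks the larger, later one, even though both pairs differ by the same two weeks and the same amounts.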
Many large and small decisions we make in our daily lives (which ice cream to choose, what research projects to pursue, which partner to marry) require an exploration of alternatives before committing to and exploiting the benefits of a particular choice. Furthermore, many decisions require re-evaluation, and further exploration of alternatives, in the face of changing needs or circumstances. That is, often our decisions depend on a higher level choice: whether to exploit well known but possibly suboptimal alternatives or to explore risky but potentially more profitable ones. How adaptive agents choose between exploitation and exploration remains an important and open question that has received relatively limited attention in the behavioural and brain sciences. The choice could depend on a number of factors, including the familiarity of the environment, how quickly the environment is likely to change and the relative value of exploiting known sources of reward versus the cost of reducing uncertainty through exploration. There is no known generally optimal solution to the exploration versus exploitation problem, and a solution to the general case may indeed not be possible. However, there have been formal analyses of the optimal policy under constrained circumstances. There have also been specific suggestions of how humans and animals may respond to this problem under particular experimental conditions as well as proposals about the brain mechanisms involved. Here, we provide a brief review of this work, discuss how exploration and exploitation may be mediated in the brain and highlight some promising future directions for research.
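To make the exploit-versus-explore choice concrete, here is a minimal sketch of the epsilon-greedy rule, a simple and well-known heuristic. It is not a policy proposed in the abstract above and is not optimal in general; the function name, the `estimates` list, and the `epsilon` parameter are all illustrative assumptions.

```python
import random


def epsilon_greedy(estimates, epsilon, rng):
    """Pick an option index from a list of estimated reward values.

    With probability epsilon the agent explores (uniform random option);
    otherwise it exploits the option with the highest current estimate.
    All names and parameters here are hypothetical.
    """
    if rng.random() < epsilon:
        return rng.randrange(len(estimates))  # explore
    # exploit: index of the largest estimate (first one on ties)
    return max(range(len(estimates)), key=lambda i: estimates[i])
```

With `epsilon=0.0` the rule always exploits the best-known option; raising `epsilon` trades some immediate reward for information about the alternatives.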
Previous research, involving monetary rewards, found that limbic reward-related areas show greater activity when an intertemporal choice includes an immediate reward than when the options include only delayed rewards. In contrast, the lateral prefrontal and parietal cortex (areas commonly associated with deliberative cognitive processes, including future planning) respond to intertemporal choices in general but do not exhibit sensitivity to immediacy (McClure et al., 2004). The current experiments extend these findings to primary rewards (fruit juice or water) and time delays of minutes instead of weeks. Thirsty subjects choose between small volumes of drinks delivered at precise times during the experiment (e.g., 2 ml now vs 3 ml in 5 min). Consistent with previous findings, limbic activation was greater for choices between an immediate reward and a delayed reward than for choices between two delayed rewards, whereas the lateral prefrontal cortex and posterior parietal cortex responded similarly whether choices were between an immediate and a delayed reward or between two delayed rewards. Moreover, relative activation of the two sets of brain regions predicts actual choice behavior. A second experiment finds that when the delivery of all rewards is offset by 10 min (so that the earliest available juice reward in any choice is 10 min), no differential activity is observed in limbic reward-related areas for choices involving the earliest versus only more delayed rewards. We discuss implications of this finding for differences between primary and secondary rewards.
Functional MRI experiments in human subjects strongly suggest that the striatum participates in processing information about the predictability of rewarding stimuli. However, stimuli can be unpredictable in character (what stimulus arrives next), unpredictable in time (when the stimulus arrives), and unpredictable in amount (how much arrives). These variables have not been dissociated in previous imaging work in humans, thus conflating possible interpretations of the kinds of expectation errors driving the measured brain responses. Using a passive conditioning task and fMRI in human subjects, we show that positive and negative prediction errors in reward delivery time correlate with BOLD changes in human striatum, with the strongest activation lateralized to the left putamen. For the negative prediction error, the brain response was elicited by expectations only and not by stimuli presented directly; that is, we measured the brain response to nothing delivered (juice expected but not delivered) contrasted with nothing delivered (nothing expected).
Coca-Cola (Coke) and Pepsi are nearly identical in chemical composition, yet humans routinely display strong subjective preferences for one or the other. This simple observation raises the important question of how cultural messages combine with content to shape our perceptions, even to the point of modifying behavioral preferences for a primary reward like a sugared drink. We delivered Coke and Pepsi to human subjects in behavioral taste tests and also in passive experiments carried out during functional magnetic resonance imaging (fMRI). Two conditions were examined: (1) anonymous delivery of Coke and Pepsi and (2) brand-cued delivery of Coke and Pepsi. For the anonymous task, we report a consistent neural response in the ventromedial prefrontal cortex that correlated with subjects' behavioral preferences for these beverages. In the brand-cued experiment, brand knowledge for one of the drinks had a dramatic influence on expressed behavioral preferences and on the measured brain responses.
Curiosity has been described as the "wick in the candle of learning" but its underlying mechanisms are not well understood. We scanned subjects with fMRI while they read trivia questions. The level of curiosity when reading questions is correlated with activity in caudate regions previously suggested to be involved in anticipated reward or encoding prediction error. This finding led to a behavioral study showing that subjects spend more scarce resources (either limited tokens, or waiting time) to find out answers when they are more curious. The fMRI also showed that curiosity increases activity in memory areas when subjects guess incorrectly, which suggests that curiosity may enhance memory for surprising new information. This prediction about memory enhancement is confirmed in a behavioral study: higher curiosity in the initial session is correlated with better recall of surprising answers 10 days later.
Keywords: Neuroimaging, Memory, Learning, Brain
Curiosity is the complex feeling and cognition accompanying the desire to learn what is unknown. Curiosity can be both helpful and dangerous. It plays a critical role in motivating learning and discovery, especially by creative professionals, increasing the world's store of knowledge. Einstein, for example, once said, "I have no special talents. I am only passionately curious" (Hoffmann, 1972). The dangerous side of curiosity is its association with exploratory behaviors with harmful consequences. An ancient example is the mythical Pandora, who opened a box that unleashed misfortunes on the world. In modern times, technology such as the Internet augments both good and bad effects of curiosity, by putting both enormous amounts of information and potentially dangerous social encounters a mouse click away.
Despite its importance, the psychological and neural underpinnings of human curiosity remain poorly understood.
Philosophers and psychologists have described curiosity as an appetite for knowledge, a drive like hunger and thirst (Loewenstein, 1994), the hunger pang of an 'info-vore' (Biederman & Vessel, 2006), and "the wick in the candle of learning" (William Arthur Ward). In reinforcement learning a "novelty bonus" is used to motivate the choice of unexplored strategies (Kakade & Dayan, 2002). Curiosity can be thought of as the psychological manifestation of such a novelty bonus. A theory guiding our research holds that curiosity arises from an incongruity or 'information gap': a discrepancy between what one knows and what one wants to know (Loewenstein, 1994). The theory assumes that the aspired level of knowledge increases sharply with a small increase in knowledge, so that the information gap grows with initial learning. When one is sufficiently knowledgeable, however, the gap shrinks and curiosity falls. If curiosity is like a hunger for knowledge, then a small "priming dose" of information increases the hunger, and the decrease in curiosity from knowing a lot is like being satiated by information. In the information-gap theory, the object of curiosity is a...
Current theories hypothesize that dopamine neuronal firing encodes reward prediction errors. Although studies in nonhuman species provide direct support for this theory, functional magnetic resonance imaging (fMRI) studies in humans have focused on brain areas targeted by dopamine neurons [ventral striatum (VStr)] rather than on brainstem dopaminergic nuclei [ventral tegmental area (VTA) and substantia nigra]. We used fMRI tailored to directly image the brainstem. When primary rewards were used in an experiment, the VTA blood oxygen level-dependent (BOLD) response reflected a positive reward prediction error, whereas the VStr encoded positive and negative reward prediction errors. When monetary gains and losses were used, VTA BOLD responses reflected positive reward prediction errors modulated by the probability of winning. We detected no significant VTA BOLD response to nonrewarding events.
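For readers unfamiliar with the term, a reward prediction error is the difference between the reward received and the reward expected. The sketch below computes it with a simple delta-rule (Rescorla-Wagner style) update. This is a textbook illustration under assumed parameters, not the analysis model used in the study above; the function name, `learning_rate`, and `initial_value` are hypothetical.

```python
def prediction_errors(rewards, learning_rate=0.5, initial_value=0.0):
    """Trial-by-trial reward prediction errors under a simple delta rule.

    On each trial the error is (reward - expected value); the expectation
    then moves toward the reward by learning_rate * error. Parameter values
    are illustrative only.
    """
    value = initial_value
    errors = []
    for reward in rewards:
        error = reward - value  # positive if better than expected
        errors.append(error)
        value += learning_rate * error
    return errors, value
```

For example, with rewards `[1, 1, 1, 0]` the errors are positive and shrinking while the reward keeps arriving, then negative on the final trial when an expected reward is omitted.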
Certain classes of stimuli, such as food and drugs, are highly effective in activating reward regions. We show in humans that activity in these regions can be modulated by the predictability of the sequenced delivery of two mildly pleasurable stimuli, orally delivered fruit juice and water. Using functional magnetic resonance imaging, the activity for rewarding stimuli in both the nucleus accumbens and medial orbitofrontal cortex was greatest when the stimuli were unpredictable. Moreover, the subjects' stated preference for either juice or water was not directly correlated with activity in reward regions but instead was correlated with activity in sensorimotor cortex. For pleasurable stimuli, these findings suggest that predictability modulates the response of human reward regions, and subjective preference can be dissociated from this response.