We derive a family of risk-sensitive reinforcement learning methods for agents that face sequential decision-making tasks in uncertain environments. By applying a utility function to the temporal difference (TD) error, nonlinear transformations are effectively applied not only to the received rewards but also to the true transition probabilities of the underlying Markov decision process. When appropriate utility functions are chosen, the agents' behaviors express key features of human behavior as predicted by prospect theory (Kahneman & Tversky, 1979), for example, different risk preferences for gains and losses, as well as the shape of subjective probability curves. We derive a risk-sensitive Q-learning algorithm, which is necessary for modeling human behavior when transition probabilities are unknown, and prove its convergence. As a proof of principle for the applicability of the new framework, we apply it to quantify human behavior in a sequential investment task. We find that the risk-sensitive variant provides a significantly better fit to the behavioral data and that it leads to an interpretation of the subject's responses that is indeed consistent with prospect theory. The analysis of simultaneously measured fMRI signals shows a significant correlation of the risk-sensitive TD error with BOLD signal change in the ventral striatum. In addition, we find a significant correlation of the risk-sensitive Q-values with neural activity in the striatum, cingulate cortex, and insula that is not present if standard Q-values are used.
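The core idea of the abstract above, passing the TD error through a utility function before it updates the Q-value, can be illustrated with a minimal sketch. The abstract does not state the exact update rule, so everything here is an assumption made for illustration: the piecewise-linear `utility` (losses weighted more heavily than gains), the function and variable names, the learning rate, and the toy one-step task with a "safe" and a "risky" action of equal mean reward.

```python
def utility(x, k_gain=0.5, k_loss=2.0):
    """Hypothetical piecewise-linear utility: negative TD errors are
    amplified (k_loss) and positive ones damped (k_gain)."""
    return k_gain * x if x >= 0 else k_loss * x


def risk_sensitive_q_update(q, state, action, reward, next_state,
                            alpha=0.1, gamma=0.9, u=utility):
    """One tabular Q-learning step with the TD error passed through u.
    With u(x) = x this reduces to standard Q-learning."""
    # Terminal or unknown next states contribute no future value.
    best_next = max(q[next_state].values()) if next_state in q else 0.0
    td_error = reward + gamma * best_next - q[state][action]
    q[state][action] += alpha * u(td_error)
    return td_error


def run_demo(steps=1000):
    """Toy one-step task: 'safe' always pays 0.5, 'risky' pays +2 or -1.
    Outcomes alternate deterministically as a stand-in for a 50/50 gamble,
    so both actions have the same mean reward (0.5)."""
    q = {"s": {"safe": 0.0, "risky": 0.0}}
    risky_rewards = [2.0, -1.0]
    for t in range(steps):
        risk_sensitive_q_update(q, "s", "safe", 0.5, None)
        risk_sensitive_q_update(q, "s", "risky", risky_rewards[t % 2], None)
    return q


if __name__ == "__main__":
    q = run_demo()
    print("safe :", q["s"]["safe"])
    print("risky:", q["s"]["risky"])
```

Under these assumptions the learned value of the risky action settles below that of the safe action even though their mean rewards are equal, which is the kind of loss-averse preference the abstract attributes to suitably chosen utility functions. The convergence result and the utility functions used in the paper itself are not reproduced here.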
Background: The formation of an odor percept in humans is strongly associated with visual information. However, much less is known about the roles of learning and memory in shaping the multisensory nature of odor representations in the brain. Method: The dynamics of odor and visual association in olfaction were investigated using three functional magnetic resonance imaging (fMRI) paradigms. In two paradigms, a visual cue was paired with an odor. In the third, the same visual cue was never paired with an odor. In this experimental design, if the visual cue was not influenced by odor–visual pairing, then the blood‐oxygen‐level‐dependent (BOLD) signal elicited by subsequent visual cues should be similar across all three paradigms. Additionally, intensity, a major dimension of odor perception, was used as a modulator of associative learning, which was characterized in terms of the spatiotemporal behavior of the BOLD signal in olfactory structures. Results: A single odor–visual pairing cue could subsequently induce primary olfactory cortex activity when only the visual cue was presented. This activity was intensity dependent and was also detected in secondary olfactory structures and hippocampus. Conclusion: This study provides evidence for a rapid learning response in the olfactory system by a visual cue following odor and visual cue pairing. The novel data and paradigms suggest new avenues to explore the dynamics of odor learning and multisensory representations that contribute to the construction of a unified odor percept in the human brain.