The authors present a unified account of 2 neural systems concerned with the development and expression of adaptive behaviors: a mesencephalic dopamine system for reinforcement learning and a "generic" error-processing system associated with the anterior cingulate cortex. The existence of the error-processing system has been inferred from the error-related negativity (ERN), a component of the event-related brain potential elicited when human participants commit errors in reaction-time tasks. The authors propose that the ERN is generated when a negative reinforcement learning signal is conveyed to the anterior cingulate cortex via the mesencephalic dopamine system and that this signal is used by the anterior cingulate cortex to modify performance on the task at hand. They provide support for this proposal using both computational modeling and psychophysiological experimentation.
Human beings learn from the consequences of their actions. Thorndike (1911/1970) originally described this phenomenon with his law of effect, which made explicit the commonsense notion that actions that are followed by feelings of satisfaction are more likely to be generated again in the future, whereas actions that are followed by negative outcomes are less likely to reoccur. This fundamental reinforcement learning principle has been developed by the artificial intelligence community into a body of algorithms used to train autonomous systems to operate independently in complex and uncertain environments (Barto & Sutton, 1997; Sutton & Barto, 1998). Research has also evaluated the neural mechanisms underlying reinforcement learning in biological systems, but these mechanisms are still poorly understood.
In this article, we provide a framework for understanding the neural basis of reinforcement learning in humans. Our proposal links together two areas of research that have, until now, been considered separately.
On the one hand, we have previously inferred the existence of a generic, high-level error-processing system in humans from the error-related negativity (ERN), a negative deflection in the ongoing electroencephalogram (EEG) seen when human participants commit errors in a wide variety of psychological tasks. The ERN appears to be generated in the anterior cingulate cortex. On the other hand, other researchers have argued that the mesencephalic dopamine system conveys reinforcement learning signals to the basal ganglia and frontal cortex, where they are used to facilitate the development of adaptive motor programs. Although the reinforcement learning function attributed to the mesencephalic dopamine system and the error-processing function associated with the ERN appear to be concerned with the same problem (namely, evaluating the appropriateness of ongoing events, and using that information to facilitate the development and expression of adaptive behaviors), a possible relationship between these two systems remains to be explored.
In this article, we propose a hypothesis that unifies the two accounts by explicitly linking the gen...
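As an illustration only, the idea of a negative reinforcement learning signal can be sketched with the temporal-difference error, one standard formalization from the reinforcement learning literature (Sutton & Barto, 1998). The function names, the reward coding (1 for a correct response, 0 for an error), and the parameter values below are assumptions chosen for the example; they are not taken from the authors' model.

```python
def td_error(reward, value_next, value_current, gamma=1.0):
    """Temporal-difference error: delta = r + gamma * V(s') - V(s).

    A negative delta means the outcome was worse than predicted.
    """
    return reward + gamma * value_next - value_current


def update_value(value_current, delta, alpha=0.1):
    """Move the value estimate a fraction alpha toward the observed outcome."""
    return value_current + alpha * delta


# Hypothetical trial: the system predicts a good outcome (value 0.8),
# and the trial ends after the response (value of the next state is 0).
expected = 0.8

# Error trial: no reward arrives, so the signal is negative.
delta_error = td_error(reward=0.0, value_next=0.0, value_current=expected)

# Correct trial: reward arrives, so the signal is slightly positive.
delta_correct = td_error(reward=1.0, value_next=0.0, value_current=expected)

# After an error, the prediction is revised downward.
revised = update_value(expected, delta_error)

print(delta_error, delta_correct, revised)
```

In this sketch, the sign of the error signal distinguishes outcomes that are worse than expected from those that are better than expected, which is the kind of quantity the proposal describes as being conveyed to the anterior cingulate cortex.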
Humans can monitor actions and compensate for errors. Analysis of the human event-related brain potentials (ERPs) accompanying errors provides evidence for a neural process whose activity is specifically associated with monitoring and compensating for erroneous behavior. This error-related activity is enhanced when subjects strive for accurate performance but is diminished when response speed is emphasized at the expense of accuracy. The activity is also related to attempts to compensate for the erroneous behavior.
To understand the endogenous components of the event-related brain potential (ERP), we must use data about the components' antecedent conditions to form hypotheses about the information-processing function of the underlying brain activity. These hypotheses, in turn, generate testable predictions about the consequences of the component. We review the application of this approach to the analysis of the P300 component. The amplitude of the P300 is controlled multiplicatively by the subjective probability and the task relevance of the eliciting events, whereas its latency depends on the duration of stimulus evaluation. These and other factors suggest that the P300 is a manifestation of activity occurring whenever one's model of the environment must be revised. Tests of three predictions based on this “context updating” model are reviewed. Verleger's critique is based on a misconstrual of the model as well as a partial and misleading reading of the relevant literature.
Recent studies indicate that subjects may respond to visual information during either an early parallel phase or a later focused phase and that the selection of the relevant phase is data driven. Using the noise-compatibility paradigm, we tested the hypothesis that this selection may also be strategic and context driven. At least part of the interference effect observed in this paradigm is due to response activation during the parallel-processing phase. We manipulated subjects' expectancies for compatible and incompatible noise in 4 experiments and effectively modulated the interference effect. The results suggest that expectancies about the relative utility of the information extracted during the parallel and focused phases determine which phase is used to activate responses.