It has been suggested that dopamine (DA) represents reward-prediction-error (RPE) as defined in reinforcement learning, and therefore that DA responds to unpredicted but not predicted reward. However, recent studies have found sustained DA responses to predictable reward in tasks involving self-paced behavior, and suggested that this response represents a motivational signal. We have previously shown that RPE can be sustained if there is decay/forgetting of learned-values, which can be implemented as decay of synaptic strengths storing learned-values. This account, however, did not explain the suggested link between tonic/sustained DA and motivation. In the present work, we explored the motivational effects of the value-decay in self-paced approach behavior, modeled as a series of ‘Go’ or ‘No-Go’ selections towards a goal. Through simulations, we found that the value-decay can, counterintuitively, enhance motivation; specifically, it facilitates fast goal-reaching. Mathematical analyses revealed that the potential underlying mechanisms are twofold: (1) decay-induced sustained RPE creates a gradient of ‘Go’ values towards a goal, and (2) value-contrasts between ‘Go’ and ‘No-Go’ are generated because, while chosen values are continually updated, unchosen values simply decay. Our model provides potential explanations for the key experimental findings that suggest DA's roles in motivation: (i) slowdown of behavior by post-training blockade of DA signaling, (ii) observations that DA blockade severely impairs effortful actions to obtain rewards while largely sparing seeking of easily obtainable rewards, and (iii) relationships between the reward amount, the level of motivation reflected in the speed of behavior, and the average level of DA. These results indicate that reinforcement learning with value-decay, or forgetting, provides a parsimonious mechanistic account for DA's roles in value-learning and motivation.
Our results also suggest that when biological systems for value-learning are active even though learning has apparently converged, the systems might be in a state of dynamic equilibrium, where learning and forgetting are balanced.
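The ‘Go’/‘No-Go’ setup described in this abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the article: the function name `run_go_nogo`, the Q-learning-style update target, the logistic (two-action softmax) choice rule, and all parameter values. Whether decay actually shortens goal-reaching time in this sketch depends on those choices, so the code only records the number of steps per trial and makes no claim about the outcome.

```python
import math
import random

def run_go_nogo(n_states=7, n_trials=200, alpha=0.5, gamma=0.97,
                beta=5.0, decay=0.99, reward=1.0, seed=0):
    """Self-paced approach: at each state choose Go (advance) or No-Go (stay).

    All parameter values are illustrative assumptions, not the article's.
    """
    rng = random.Random(seed)
    # q[s] = [value of No-Go, value of Go]; the last state is the goal.
    q = [[0.0, 0.0] for _ in range(n_states - 1)]
    steps_per_trial = []
    for _ in range(n_trials):
        s, steps = 0, 0
        while s < n_states - 1:
            # Logistic choice on the Go minus No-Go value contrast.
            p_go = 1.0 / (1.0 + math.exp(-beta * (q[s][1] - q[s][0])))
            a = 1 if rng.random() < p_go else 0
            s_next = s + a
            if s_next == n_states - 1:
                target = reward  # reaching the goal yields reward and ends the trial
            else:
                target = gamma * max(q[s_next])  # assumed Q-learning-style target
            # Only the chosen action's value is updated by the RPE.
            q[s][a] += alpha * (target - q[s][a])
            # Every stored value, chosen or not, decays at each time step.
            q = [[decay * v for v in pair] for pair in q]
            s, steps = s_next, steps + 1
        steps_per_trial.append(steps)
    return q, steps_per_trial
```

Calling `run_go_nogo(decay=0.99)` and `run_go_nogo(decay=1.0)` and comparing the mean of the later entries of `steps_per_trial` is one way to explore how decay affects the speed of goal-reaching under these assumed settings.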
It has been suggested that the midbrain dopamine (DA) neurons, receiving inputs from the cortico-basal ganglia (CBG) circuits and the brainstem, compute reward prediction error (RPE), the difference between reward obtained or expected to be obtained and reward that had been expected to be obtained. These reward expectations are suggested to be stored in the CBG synapses and updated according to RPE through synaptic plasticity, which is induced by released DA. These together constitute the “DA=RPE” hypothesis, which describes the mutual interaction between DA and the CBG circuits and serves as the primary working hypothesis in studying reward learning and value-based decision-making. However, recent work has revealed a new type of DA signal that appears not to represent RPE. Specifically, it has been found in a reward-associated maze task that striatal DA concentration primarily shows a gradual increase toward the goal. We explored whether such ramping DA could be explained by extending the “DA=RPE” hypothesis by taking into account biological properties of the CBG circuits. In particular, we examined effects of possible time-dependent decay of DA-dependent plastic changes of synaptic strengths by incorporating decay of learned values into the RPE-based reinforcement learning model and simulating reward learning tasks. We then found that incorporation of such a decay dramatically changes the model's behavior, causing gradual ramping of RPE. Moreover, we further incorporated magnitude-dependence of the rate of decay, which could potentially be in accord with some past observations, and found that near-sigmoidal ramping of RPE, resembling the observed DA ramping, could then occur. Given that synaptic decay can be useful for flexibly reversing and updating the learned reward associations, especially in case the baseline DA is low and encoding of negative RPE by DA is limited, the observed DA ramping would be indicative of the operation of such flexible reward learning.
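The core idea of this abstract, that adding decay of learned values to RPE-based learning leaves a residual RPE that ramps toward the goal, can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' model: the linear track of states, the function name `simulate_td_with_decay`, the uniform (magnitude-independent) decay factor, and all parameter values are hypothetical.

```python
def simulate_td_with_decay(n_states=7, n_trials=500, alpha=0.5,
                           gamma=0.97, decay=0.99, reward=1.0):
    """TD learning on a linear track with reward at the last state.

    After each time step every learned value is multiplied by `decay`
    (decay=1.0 recovers standard TD learning without forgetting).
    Returns the learned values and the RPEs from the final trial.
    """
    values = [0.0] * n_states
    rpe = [0.0] * n_states
    for _ in range(n_trials):
        for s in range(n_states):
            r = reward if s == n_states - 1 else 0.0
            next_value = values[s + 1] if s + 1 < n_states else 0.0
            # RPE: obtained reward plus discounted upcoming value,
            # minus the value that had been expected at this state.
            rpe[s] = r + gamma * next_value - values[s]
            values[s] += alpha * rpe[s]
            # Time-dependent decay (forgetting) of all learned values.
            values = [decay * v for v in values]
    return values, rpe

if __name__ == "__main__":
    _, rpe_decay = simulate_td_with_decay(decay=0.99)
    _, rpe_plain = simulate_td_with_decay(decay=1.0)
    print("RPE with decay:   ", [round(d, 3) for d in rpe_decay])
    print("RPE without decay:", [round(d, 3) for d in rpe_plain])
```

With `decay=1.0` the RPE at every state shrinks toward zero as learning converges, whereas with `decay < 1` the RPE stays positive and grows from the start state toward the goal. The near-sigmoidal ramp mentioned in the abstract relies on magnitude-dependent decay, which this sketch does not include.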
Background: Smoking cessation helps extend a healthy life span and reduces medical expenses. However, the standard 12-week smoking cessation program in Japan has several notable problems. First, only 30% of participants complete this program. Second, participants may choose not to participate unless they have a strong motivation to quit smoking, such as health problems. Third, the program does not provide enough support during the period between clinical visits and after 12 weeks. Objective: This study examined the efficacy of the 24-week ascure program to address the problems of accessibility and continuous support. The program combines online mentoring, over-the-counter pharmacotherapy, and a smartphone app. Methods: Using a retrospective study design, we investigated data for 177 adult smokers who were enrolled in the ascure smoking cessation program between August 2017 and August 2018. The primary outcomes were continuous abstinence rates (CARs) during weeks 9-12 and weeks 21-24. To confirm smoking status, we performed salivary cotinine testing at weeks 12 and 24. We also evaluated the program adherence rate. Finally, we performed exploratory analysis to determine the factors associated with continuous abstinence at weeks 21-24 to provide insights for assisting with long-term continuous abstinence. Results: The CARs of all participants for weeks 9-12 and weeks 21-24 were 48.6% (95% CI 41.2-56.0) and 47.5% (95% CI 40.0-54.8), respectively. Program adherence rates were relatively high throughout (72% at week 12 and 60% at week 24). In the analysis of the factors related to the CAR at weeks 21-24, the number of entries in the app’s digital diary and the number of educational videos watched during the first 12 weeks were significant factors. Conclusions: The ascure program achieved favorable CARs, and participants showed high adherence. Proactive usage of the smartphone app may contribute to smoking cessation success in the long term.
Highlights: Mutation of Cys227 in mouse AQP11 has been reported to cause kidney injury. The importance of Cys227 for the molecular function of AQP11 has yet to be elucidated. We examined the molecular function of Cys227 AQP11 in transfected mammalian cells. The mutation at Cys227 caused an increase in the water permeability of AQP11. Cys227 is considered to be crucial for the proper molecular function of AQP11.
Difficulty in cessation of drinking, smoking, or gambling has been widely recognized. Conventional theories proposed relative dominance of habitual over goal‐directed control, but human studies have not convincingly supported them. Referring to the recently suggested “successor representation (SR)” of states that enables partially goal‐directed control, we propose a dopamine‐related mechanism that makes resistance to habitual reward‐obtaining particularly difficult. We considered that long‐standing behavior towards a certain reward without resisting temptation can (but does not always) lead to the formation of a rigid dimension‐reduced SR based on the goal state, which cannot be updated. Then, in our model assuming such rigid reduced SR, whereas no reward prediction error (RPE) is generated at the goal while no resistance is made, a sustained large positive RPE is generated upon goal reaching once the person starts resisting temptation. Such sustained RPE is somewhat similar to the hypothesized sustained fictitious RPE caused by drug‐induced dopamine. In contrast, if rigid reduced SR is not formed and states are represented individually as in simple reinforcement learning models, no sustained RPE is generated at the goal. Formation of rigid reduced SR also attenuates the resistance‐dependent decrease in the value of the cue for behavior, makes subsequent introduction of punishment after the goal ineffective, and potentially enhances the propensity of nonresistance through the influence of RPEs via the spiral striatum‐midbrain circuit. These results suggest that formation of rigid reduced SR makes cessation of habitual reward‐obtaining particularly difficult and can thus be a mechanism for addiction, common to substance and nonsubstance reward.