Recent empirical studies have implicated the use of the motor system during action observation, imitation and social interaction. In this paper, we explore the computational parallels between the processes that occur in motor control and in action observation, imitation, social interaction and theory of mind. In particular, we examine the extent to which motor commands acting on the body can be equated with communicative signals acting on other people and suggest that computational solutions for motor control may have been extended to the domain of social interaction.
The estimation of the reward an action will yield is critical in decision-making. To elucidate the role of the basal ganglia in this process, we recorded striatal neurons of monkeys who chose between left and right handle turns, based on the estimated reward probabilities of the actions. During a delay period before the choices, the activity of more than one-third of striatal projection neurons was selective to the values of one of the two actions. Fewer neurons were tuned to relative values or action choice. These results suggest representation of action values in the striatum, which can guide action selection in the basal ganglia circuit.
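The action-value account described above can be illustrated with a minimal sketch: an agent chooses between two actions ("left" and "right" handle turns) and incrementally estimates each action's reward probability from stochastic binary rewards. All names, the learning rate, and the epsilon-greedy choice rule here are illustrative assumptions, not the authors' recording or analysis method.

```python
import random

def learn_action_values(reward_probs, alpha=0.2, epsilon=0.1,
                        n_trials=1000, seed=0):
    """Estimate Q(left) and Q(right) from stochastic rewards.

    Hypothetical sketch: incremental (delta-rule) updates of the two
    action values, with epsilon-greedy action selection.
    """
    rng = random.Random(seed)
    q = {"left": 0.0, "right": 0.0}
    for _ in range(n_trials):
        # Occasionally explore; otherwise pick the higher-valued action.
        if rng.random() < epsilon:
            action = rng.choice(["left", "right"])
        else:
            action = max(q, key=q.get)
        # Binary reward drawn with the action's true probability.
        reward = 1.0 if rng.random() < reward_probs[action] else 0.0
        # Move the estimate a fraction alpha toward the observed reward.
        q[action] += alpha * (reward - q[action])
    return q

q = learn_action_values({"left": 0.9, "right": 0.5})
```

After enough trials, `q["left"]` and `q["right"]` approximate the true reward probabilities, so comparing them suffices to guide action selection, which is the computational role the abstract proposes for striatal action-value neurons.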
This article presents a reinforcement learning framework for continuous-time dynamical systems without a priori discretization of time, state, and action. Based on the Hamilton-Jacobi-Bellman (HJB) equation for infinite-horizon, discounted reward problems, we derive algorithms for estimating value functions and improving policies with the use of function approximators. The process of value function estimation is formulated as the minimization of a continuous-time form of the temporal difference (TD) error. Update methods based on backward Euler approximation and exponential eligibility traces are derived, and their correspondences with the conventional residual gradient, TD(0), and TD(lambda) algorithms are shown. For policy improvement, two methods are formulated: a continuous actor-critic method and a value-gradient-based greedy policy. As a special case of the latter, a nonlinear feedback control law using the value gradient and the model of the input gain is derived. Advantage updating, a model-free algorithm derived previously, is also formulated in the HJB-based framework. The performance of the proposed algorithms is first tested in a nonlinear control task of swinging a pendulum up with limited torque. It is shown in the simulations that (1) the task is accomplished by the continuous actor-critic method in a number of trials several times fewer than by the conventional discrete actor-critic method; (2) among the continuous policy update methods, the value-gradient-based policy with a known or learned dynamic model performs several times better than the actor-critic method; and (3) a value function update using exponential eligibility traces is more efficient and stable than that based on Euler approximation. The algorithms are then tested in a higher-dimensional task: cart-pole swing-up. This task is accomplished in several hundred trials using the value-gradient-based policy with a learned dynamic model.
Human and animal decisions are modulated by a variety of environmental and intrinsic contexts. Here I consider computational factors that can affect decision making and review anatomical structures and neurochemical systems that are related to contextual modulation of decision making. Expectation of a high reward can motivate a subject to take an action despite a large cost, a decision that is influenced by dopamine in the anterior cingulate cortex. Uncertainty of action outcomes can promote risk taking and exploratory choices, in which norepinephrine and the orbitofrontal cortex appear to be involved. Predictable environments should facilitate consideration of longer-delayed rewards, which depends on serotonin in the dorsal striatum and dorsal prefrontal cortex. This article aims to sort out factors that affect the process of decision making from the viewpoint of reinforcement learning theory and to bridge such computational needs with their neurophysiological substrates.
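Two of the reinforcement learning metaparameters this perspective maps onto neuromodulators can be sketched directly: the inverse temperature of a softmax choice rule (controlling the exploration-exploitation balance linked above to outcome uncertainty) and the discount factor (controlling how strongly delayed rewards are weighted). The parameter names and values below are illustrative assumptions.

```python
import math

def softmax(values, beta):
    """Choice probabilities over action values.

    Low beta -> near-uniform, exploratory choices;
    high beta -> near-deterministic, exploitative choices.
    """
    exps = [math.exp(beta * v) for v in values]
    z = sum(exps)
    return [e / z for e in exps]

def discounted_value(rewards, gamma):
    """Value of a reward sequence r_0, r_1, ... under discount factor gamma.

    Higher gamma weights longer-delayed rewards more heavily.
    """
    return sum(r * gamma ** t for t, r in enumerate(rewards))

# Exploratory vs. exploitative choice between values 1.0 and 0.5:
p_explore = softmax([1.0, 0.5], beta=0.1)   # close to 50/50
p_exploit = softmax([1.0, 0.5], beta=10.0)  # strongly favors 1.0

# A reward delivered after two steps, discounted at gamma = 0.9:
v_delayed = discounted_value([0.0, 0.0, 1.0], gamma=0.9)
```

In the article's framework, setting such metaparameters appropriately for the current context is the computational problem that neuromodulatory systems are hypothesized to solve.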