2021
DOI: 10.1007/978-981-16-1288-6_2
Models of Human Behavioral Agents in Bandits, Contextual Bandits and RL

Abstract: Artificial behavioral agents are often evaluated based on their consistent behaviors and performance in taking sequential actions in an environment to maximize some notion of cumulative reward. However, human decision making in real life usually involves different strategies and behavioral trajectories that lead to the same empirical outcome. Motivated by the clinical literature on a wide range of neurological and psychiatric disorders, we propose here a more general and flexible parametric framework for sequential …

Cited by 11 publications (4 citation statements)
References 31 publications (41 reference statements)
“…Human behavior inference using reinforcement learning (RL) is particularly common in robotics and contemporary neuroscience. Lin et al. (2019) used a model-free RL approach to understanding decision making. However, that study focused on abnormal processes in psychiatry.…”
Section: Introduction (mentioning)
Confidence: 99%
“…Draw attention to neuroscience-inspired algorithms: Like deep learning, many fields in artificial intelligence (AI) have benefited from a rich source of inspiration from neuroscience for architectures and algorithms. While the flow of inspiration from neuroscience to machine learning has been sporadic [ 50 ], it systematically narrows the major gaps between humans and machines: the size of required training datasets [ 51 ], out-of-set generalization [ 52 ], adversarial robustness [ 53 , 54 ], reinforcement learning [ 13 , 55 , 56 ] and model complexity [ 57 ]. Biological computations that are critical to cognitive functions are usually excellent candidates for incorporation into artificial systems, and neuroscience studies can validate the plausibility of existing AI techniques as integral components of an overall general intelligence system [ 58 ].…”
Section: Discussion (mentioning)
Confidence: 99%
“…RP-AC [16] is a revamped reward-punishment Actor-Critic framework that formulates a policy gradient for continuous control. Split-Q Learning [17] proposed a more generalized reward-punishment framework by parameterizing immediate positive and negative rewards and their approximated state-action values, aligning with various neurological and psychiatric mechanisms.…”
Section: A Separating Reward And Punishment (mentioning)
Confidence: 99%
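The quoted statement describes the core idea of separating reward and punishment into two parameterized value streams. A minimal sketch of that idea is shown below; it is an illustrative assumption, not the authors' implementation, and all names (`SplitQLearner`, `q_plus`, `q_minus`, the weights `w_plus`/`w_minus`) and parameter values are hypothetical:

```python
import random
from collections import defaultdict

class SplitQLearner:
    """Tabular sketch of a split reward-punishment Q-learner: separate
    state-action value tables for positive rewards (q_plus) and negative
    rewards (q_minus), each with its own learning rate and weight."""

    def __init__(self, actions, alpha_plus=0.1, alpha_minus=0.1,
                 w_plus=1.0, w_minus=1.0, gamma=0.95, epsilon=0.1):
        self.actions = actions
        self.alpha_plus = alpha_plus    # learning rate for the reward stream
        self.alpha_minus = alpha_minus  # learning rate for the punishment stream
        self.w_plus = w_plus            # weight on learned reward values
        self.w_minus = w_minus          # weight on learned punishment values
        self.gamma = gamma
        self.epsilon = epsilon
        self.q_plus = defaultdict(float)
        self.q_minus = defaultdict(float)

    def value(self, state, action):
        # Combined value: weighted sum of the two streams.
        return (self.w_plus * self.q_plus[(state, action)]
                + self.w_minus * self.q_minus[(state, action)])

    def act(self, state):
        # Epsilon-greedy over the combined value.
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.value(state, a))

    def update(self, state, action, reward, next_state):
        # Split the scalar reward into positive and negative parts and
        # update each stream toward its own bootstrapped target.
        r_plus = max(reward, 0.0)
        r_minus = min(reward, 0.0)
        best = max(self.actions, key=lambda a: self.value(next_state, a))
        target_plus = r_plus + self.gamma * self.q_plus[(next_state, best)]
        target_minus = r_minus + self.gamma * self.q_minus[(next_state, best)]
        self.q_plus[(state, action)] += self.alpha_plus * (
            target_plus - self.q_plus[(state, action)])
        self.q_minus[(state, action)] += self.alpha_minus * (
            target_minus - self.q_minus[(state, action)])
```

Varying the per-stream learning rates and weights is what lets such a framework mimic different behavioral profiles (e.g. reward-seeking vs. punishment-avoidant agents) while the underlying update rule stays the same.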