Relativized hierarchical decomposition of Markov decision processes
2013
DOI: 10.1016/b978-0-444-62604-2.00023-x

Cited by 6 publications (5 citation statements) · References 19 publications
“…In the instructive feedback condition, on the other hand, participants might have felt less motivation to use a computationally demanding Bayesian updating strategy (or fewer participants might have consistently done so), because they could only rely on intrinsic reward to execute the task correctly, leading to relatively weaker performance. This notion is specifically in line with reinforcement learning theory where individuals, as biological agents, respond to environmental stimuli in ways that will result in the maximization of reward and minimization of loss (O'Hara, Hall, van Rijsbergen, & Shadbolt, 2006;Ravindran, 2013). However, we note that there are a number of different ways in which participants' behavior may have deviated from Bayes optimality, and the results of this study do not serve to fully disambiguate between these.…”
Section: Table 2, Regression Analyses of FRN Component in Monetary Condition
confidence: 79%
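The reinforcement-learning account invoked above, in which agents adjust behaviour to maximize reward and minimize loss, can be illustrated with a minimal value-learning sketch. All names and parameters below are illustrative assumptions, not taken from the cited works:

```python
import random

def run_bandit(reward_probs, episodes=5000, alpha=0.1, epsilon=0.1, seed=0):
    """Minimal epsilon-greedy agent: it estimates a value per action and
    gradually comes to prefer the action with the highest expected reward."""
    rng = random.Random(seed)
    q = [0.0] * len(reward_probs)  # estimated value of each action
    for _ in range(episodes):
        # explore with probability epsilon, otherwise exploit the best estimate
        if rng.random() < epsilon:
            a = rng.randrange(len(q))
        else:
            a = max(range(len(q)), key=lambda i: q[i])
        # stochastic reward: 1 with the action's success probability, else 0
        r = 1.0 if rng.random() < reward_probs[a] else 0.0
        q[a] += alpha * (r - q[a])  # incremental value update
    return q

# after training, the higher-payoff action carries the higher estimated value
q = run_bandit([0.2, 0.8])
```

The incremental update `q[a] += alpha * (r - q[a])` is the simplest form of the reward-driven learning rule that the quoted passage appeals to.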
“…Ambiguity is characterized by an uncertain mapping between hidden states and outcomes (e.g., states that are partially observed) – and generally calls for policy selection or decisions under uncertainty; e.g. (Alagoz et al, 2010, Ravindran, 2013). In this setting, optimal behaviour depends upon beliefs about states, as opposed to states per se .…”
Section: Introduction
confidence: 99%
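The claim that optimal behaviour under ambiguity depends on beliefs about hidden states, rather than the states themselves, can be made concrete with one step of Bayesian belief updating. This is a generic sketch under assumed toy numbers, not an implementation from the cited paper:

```python
def update_belief(belief, likelihood, transition):
    """One step of Bayesian belief updating over hidden states:
    predict with the transition model, then weight each state by the
    observation likelihood and renormalize. All values are illustrative."""
    n = len(belief)
    # predict: push the current belief through the transition matrix
    predicted = [sum(belief[i] * transition[i][j] for i in range(n))
                 for j in range(n)]
    # correct: weight by how likely the new observation is in each state
    posterior = [predicted[j] * likelihood[j] for j in range(n)]
    z = sum(posterior)
    return [p / z for p in posterior]

# two hidden states; the observation strongly favours state 1,
# so the belief shifts toward state 1
b = update_belief([0.5, 0.5],
                  likelihood=[0.1, 0.9],
                  transition=[[0.9, 0.1], [0.1, 0.9]])
```

An agent choosing actions from `b` rather than from any single assumed state is deciding on beliefs, which is exactly the distinction the quoted passage draws.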
“…a priori knowledge of symmetry, as we do; see Ravindran and Barto (2001) and Narayanamurthy and Ravindran (2008).…”
Section: Introduction
confidence: 55%