2012
DOI: 10.1007/978-3-642-28499-1_5

Leveraging Domain Knowledge to Learn Normative Behavior: A Bayesian Approach

Abstract: This paper addresses the problem of norm adaptation using Bayesian reinforcement learning. We are concerned with the effectiveness of adding prior domain knowledge when facing environments with different settings, as well as with the speed of adapting to a new environment. Individuals develop their normative framework via interaction with their surrounding environment (including other individuals). An agent acquires domain-dependent knowledge in a certain environment and later reuses it in diffe…
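
The abstract describes reusing domain-dependent knowledge acquired in one environment as prior knowledge in another. As a rough, hypothetical illustration of how such knowledge could be encoded in Bayesian RL, the sketch below seeds a Dirichlet-multinomial transition model with pseudo-counts carried over from a source environment; the class name and interface are invented for illustration and are not from the paper.

```python
# Minimal sketch (not the paper's model): domain knowledge as a Dirichlet
# prior over transition dynamics in a discrete state-action space.
import numpy as np

class DirichletTransitionModel:
    def __init__(self, n_states, n_actions, prior_counts=None):
        # prior_counts: pseudo-counts encoding knowledge from a previously
        # experienced environment; falls back to a uniform prior.
        if prior_counts is None:
            prior_counts = np.ones((n_states, n_actions, n_states))
        self.counts = prior_counts.astype(float)

    def update(self, s, a, s_next):
        # Bayesian posterior update: each observed transition adds one count.
        self.counts[s, a, s_next] += 1.0

    def expected_transitions(self, s, a):
        # Posterior mean of P(s' | s, a) under the Dirichlet posterior.
        return self.counts[s, a] / self.counts[s, a].sum()

# Transfer: counts accumulated in one environment become the prior in the
# next, so adaptation starts from informed rather than uniform beliefs.
```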

Cited by 1 publication (2 citation statements)
References 15 publications
“…One such approach proposes a two-level framework to learn the optimal state-action using Bayesian RL and update the learned optimal state-action. The norm-detection aspect is based on Bayesian dynamic programming, where the norm salience is represented in the form of state transitions (probability over belief states) (Hosseini & Ulieru, 2012).…”
Section: Related Work
confidence: 99%
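
The statement above describes norm detection via Bayesian dynamic programming, with norm salience represented as a probability over belief states. Below is a hedged sketch of that idea, with invented likelihood tables and shapes; it does not reproduce the paper's actual formulation.

```python
# Sketch: norm salience as a belief over candidate norms, updated by
# Bayes' rule from observed state transitions. Shapes and names are
# illustrative, not taken from Hosseini & Ulieru (2012).
import numpy as np

def update_norm_belief(belief, likelihoods, s, a, s_next):
    """belief[k] = P(norm k); likelihoods[k][s, a, s_next] = P(s' | s, a, norm k)."""
    posterior = np.array(
        [belief[k] * likelihoods[k][s, a, s_next] for k in range(len(belief))]
    )
    return posterior / posterior.sum()

# Example: two candidate norms over 3 states and 2 actions, uniform prior.
rng = np.random.default_rng(0)
likelihoods = [rng.dirichlet(np.ones(3), size=(3, 2)) for _ in range(2)]
belief = np.array([0.5, 0.5])
belief = update_norm_belief(belief, likelihoods, s=0, a=1, s_next=2)
```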
“…RL is suitably used for learning through trial and error by mapping situations that lead to the discovery of actions that gain the most reward (exploration) and executing actions to maximize a numerical reward signal (exploitation) (Sutton & Barto, 2018). Within the context of NorMAS, RL is a method to invoke convention emergence or norm emergence (Frantz et al., 2014, 2015; Hosseini & Ulieru, 2012; Mashayekhi et al., 2022; Neufeld et al., 2021; Pujol et al., 2005; Riveret et al., 2014a, 2014b; Sen & Airiau, 2007; Shoham & Tennenholtz, 1992, 1997; Sugawara, 2014; Yu et al., 2013, 2014, 2015, 2017). The current de-facto standard algorithm used in past studies to induce norm emergence using RL is QL (Sutton & Barto, 2018; Watkins & Dayan, 1992), a model-free RL algorithm, in the case of NorMAS through social learning (learning from interactions with other agents).…”
Section: Norm Emergence
confidence: 99%
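
Since the statement above names Q-learning through social learning as the de-facto standard for inducing norm emergence, a minimal sketch of that setup may help: stateless Q-learners meet in random pairwise coordination games, in the style of Sen & Airiau (2007). The payoffs, rates, and population size are illustrative assumptions, not drawn from any cited study.

```python
# Sketch: norm (convention) emergence via social learning with stateless
# Q-learning in repeated pairwise coordination games.
import random

N_AGENTS, N_ACTIONS = 20, 2
ALPHA, EPSILON = 0.1, 0.1          # learning rate, exploration rate
Q = [[0.0] * N_ACTIONS for _ in range(N_AGENTS)]

def choose(agent):
    # Epsilon-greedy selection over the agent's Q-values.
    if random.random() < EPSILON:
        return random.randrange(N_ACTIONS)
    return max(range(N_ACTIONS), key=lambda a: Q[agent][a])

for _ in range(5000):
    i, j = random.sample(range(N_AGENTS), 2)   # random pairwise interaction
    ai, aj = choose(i), choose(j)
    reward = 1.0 if ai == aj else -1.0         # coordination payoff
    # One-shot game, so the update has no discounted successor term.
    Q[i][ai] += ALPHA * (reward - Q[i][ai])
    Q[j][aj] += ALPHA * (reward - Q[j][aj])

# After training, most agents prefer the same action: an emergent convention.
```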