2012
DOI: 10.1007/978-3-642-28499-1_5

Leveraging Domain Knowledge to Learn Normative Behavior: A Bayesian Approach

Abstract: This paper addresses the problem of norm adaptation using Bayesian reinforcement learning. We are concerned with the effectiveness of adding prior domain knowledge when facing environments with different settings, as well as with the speed of adapting to a new environment. Individuals develop their normative framework via interaction with their surrounding environment (including other individuals). An agent acquires domain-dependent knowledge in a certain environment and later reuses it in diffe…
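
The abstract describes reusing domain-dependent knowledge acquired in one environment as prior knowledge in another. As a rough, hypothetical illustration of how such knowledge could be encoded in Bayesian RL, the sketch below seeds a Dirichlet-multinomial transition model with pseudo-counts carried over from a source environment; the class name and interface are invented for illustration and are not from the paper.

```python
# Minimal sketch (not the paper's model): domain knowledge as a Dirichlet
# prior over transition dynamics in a discrete state-action space.
import numpy as np

class DirichletTransitionModel:
    def __init__(self, n_states, n_actions, prior_counts=None):
        # prior_counts: pseudo-counts encoding knowledge from a previously
        # experienced environment; falls back to a uniform prior.
        if prior_counts is None:
            prior_counts = np.ones((n_states, n_actions, n_states))
        self.counts = prior_counts.astype(float)

    def update(self, s, a, s_next):
        # Bayesian posterior update: each observed transition adds one count.
        self.counts[s, a, s_next] += 1.0

    def expected_transitions(self, s, a):
        # Posterior mean of P(s' | s, a) under the Dirichlet posterior.
        return self.counts[s, a] / self.counts[s, a].sum()

# Transfer: counts accumulated in one environment become the prior in the
# next, so adaptation starts from informed rather than uniform beliefs.
```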

Cited by 1 publication (2 citation statements)
References 15 publications
“…One such approach proposes a two-level framework to learn the optimal state-action using Bayesian RL and update the learned optimal state-action. The norm-detection aspect is based on Bayesian dynamic programming, where the norm salience is represented in the form of state transitions (probability over belief states) (Hosseini & Ulieru, 2012).…”
Section: Related Work
confidence: 99%
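
The statement above describes norm detection via Bayesian dynamic programming, with norm salience represented as a probability over belief states. Below is a hedged sketch of that idea, with invented likelihood tables and shapes; it does not reproduce the paper's actual formulation.

```python
# Sketch: norm salience as a belief over candidate norms, updated by
# Bayes' rule from observed state transitions. Shapes and names are
# illustrative, not taken from Hosseini & Ulieru (2012).
import numpy as np

def update_norm_belief(belief, likelihoods, s, a, s_next):
    """belief[k] = P(norm k); likelihoods[k][s, a, s_next] = P(s' | s, a, norm k)."""
    posterior = np.array(
        [belief[k] * likelihoods[k][s, a, s_next] for k in range(len(belief))]
    )
    return posterior / posterior.sum()

# Example: two candidate norms over 3 states and 2 actions, uniform prior.
rng = np.random.default_rng(0)
likelihoods = [rng.dirichlet(np.ones(3), size=(3, 2)) for _ in range(2)]
belief = np.array([0.5, 0.5])
belief = update_norm_belief(belief, likelihoods, s=0, a=1, s_next=2)
```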
“…RL is suitably used for learning through trial and error by mapping situations that lead to the discovery of actions that gain the most reward (exploration) and executing actions to maximize a numerical reward signal (exploitation) (Sutton & Barto, 2018). Within the context of NorMAS, RL is a method to invoke convention emergence or norm emergence (Frantz et al., 2014, 2015; Hosseini & Ulieru, 2012; Mashayekhi et al., 2022; Neufeld et al., 2021; Pujol et al., 2005; Riveret et al., 2014a, 2014b; Sen & Airiau, 2007; Shoham & Tennenholtz, 1992, 1997; Sugawara, 2014; Yu et al., 2013, 2014, 2015, 2017). The current de-facto standard algorithm used in past studies to induce norm emergence using RL is QL (Sutton & Barto, 2018; Watkins & Dayan, 1992), a model-free RL algorithm, in the case of NorMAS through social learning (learning from interactions with other agents).…”
Section: Norm Emergence
confidence: 99%
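
Since the statement above names Q-learning through social learning as the de-facto standard for inducing norm emergence, a minimal sketch of that setup may help: stateless Q-learners meet in random pairwise coordination games, in the style of Sen & Airiau (2007). The payoffs, rates, and population size are illustrative assumptions, not drawn from any cited study.

```python
# Sketch: norm (convention) emergence via social learning with stateless
# Q-learning in repeated pairwise coordination games.
import random

N_AGENTS, N_ACTIONS = 20, 2
ALPHA, EPSILON = 0.1, 0.1          # learning rate, exploration rate
Q = [[0.0] * N_ACTIONS for _ in range(N_AGENTS)]

def choose(agent):
    # Epsilon-greedy selection over the agent's Q-values.
    if random.random() < EPSILON:
        return random.randrange(N_ACTIONS)
    return max(range(N_ACTIONS), key=lambda a: Q[agent][a])

for _ in range(5000):
    i, j = random.sample(range(N_AGENTS), 2)   # random pairwise interaction
    ai, aj = choose(i), choose(j)
    reward = 1.0 if ai == aj else -1.0         # coordination payoff
    # One-shot game, so the update has no discounted successor term.
    Q[i][ai] += ALPHA * (reward - Q[i][ai])
    Q[j][aj] += ALPHA * (reward - Q[j][aj])

# After training, most agents prefer the same action: an emergent convention.
```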