Adapting behaviors through a learning process

Jean-Marie, Alain; Tidball, Mabel

doi:10.1016/j.jebo.2004.02.007

Cited by 26 publications

(17 citation statements)

References 12 publications

“…We first study the convergence speed of the expected rewards defined in Equation (8). The result is depicted in Fig.…”

Section: Numerical Results (mentioning)

confidence: 99%

Power entangling and matching in cognitive wireless mesh networks by applying conjecture based multi-agent QQ-learning approach

Chen, Zhao, Zhang

2010

2010 IEEE Globecom Workshops

As the scarce spectrum resource is becoming overcrowded, cognitive wireless mesh networks offer great flexibility to improve spectrum utilization by opportunistically accessing the authorized frequency bands. One of the critical challenges in realizing such networks is how to adaptively match transmit powers and allocate frequency resources among secondary users (SUs) of the licensed frequency bands whilst maintaining the Quality-of-Service (QoS) requirement of the primary users (PUs), even in a mutually entangled interference environment. In this paper, we discuss the non-cooperative power allocation matching problem in cognitive wireless mesh networks formed by a number of clusters, with consideration of energy efficiency. Due to the secondary users' selfish and spontaneous features, the problem is modeled as a stochastic learning process. We extend the conventional single-agent Q-learning to a multi-user context, coined as QQ-learning, using the framework of stochastic games. Within the multi-agent QQ-learning processes, a learning SU performs Q-function updates based on the conjecture about the other SUs' behaviors. This learning algorithm provably converges given certain restrictions that arise during the learning procedure. Numerical experiments are used to verify the performance of our algorithm and demonstrate its effectiveness in improving the energy efficiency.
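
For orientation, the conventional single-agent Q-learning that the abstract takes as its starting point can be sketched as below. This is only the standard tabular update, not the paper's conjecture-based QQ-learning, whose update rule is not given here. The function name `q_update`, the state and action labels, and the values of `alpha` and `gamma` are illustrative assumptions.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step.

    Moves Q(state, action) toward reward + gamma * max_a' Q(next_state, a').
    alpha (learning rate) and gamma (discount factor) are assumed values.
    """
    best_next = max(q[(next_state, a)] for a in actions)
    target = reward + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


# Hypothetical example: two power levels, all Q-values initialised to zero.
q = defaultdict(float)
actions = ["low_power", "high_power"]
value = q_update(q, "s0", "low_power", 1.0, "s1", actions)
```

In a multi-agent setting each SU would run its own update of this kind. According to the abstract, QQ-learning additionally conditions the update on a conjecture about the other SUs' behaviors, which this sketch does not model.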

“…where the so-called reference points [8], $c_i$ and $\pi_i(a_i)$, are specific conjecture and probability, and $\omega_i$ is a positive scalar. We propose a simple rule for the SUs to configure their reference points.…”

Section: A Conjecture Based Multi-agent QQ-learning Approach (mentioning)

confidence: 99%

“…In the cases when both the strategy and the local expected payoff are to be learned, the AC-like, multiple-timescale learning algorithms [73] provide an efficient strategy-learning approach (e.g., stochastic FP [33]) for the agents. Further, when the joint action or the payoff of the adversary agents is not directly observable, conjecture-variation-based learning [74] works as an alternative way of the aforementioned learning algorithms. In the literature, these joint policy-value-iteration mechanisms for games are also known as the COmbined fully DIstributed PAyoff and Strategy-Reinforcement Learning (CODIPAS-RL) mechanisms [37].…”

Section: Multi-agent Strategy Learning in the Context of Games (mentioning)

confidence: 99%

A Survey on Applications of Model-Free Strategy Learning in Cognitive Wireless Networks

Wang, Kwasinski, Niyato, et al.

2016

IEEE Commun. Surv. Tutorials

The framework of cognitive wireless radio is expected to endow wireless devices with the cognition-intelligence ability, with which they can efficiently learn and respond to the dynamic wireless environment. In many practical scenarios, the complexity of network dynamics makes it difficult to determine the network evolution model in advance. As a result, the wireless decision-making entities may face a black-box network control problem, and model-based network management mechanisms will no longer be applicable. In contrast, model-free learning has been considered an efficient tool for designing control mechanisms when the model of the system environment or the interaction between the decision-making entities is not available as a priori knowledge. With model-free learning, the decision-making entities adapt their behaviors based on the reinforcement from their interaction with the environment and are able to (implicitly) build an understanding of the system through trial-and-error mechanisms. Such characteristics of model-free learning are highly in accordance with the requirement of cognition-based intelligence for devices in cognitive wireless networks. Recently, model-free learning has been considered one key implementation approach to adaptive, self-organized network control in cognitive wireless networks. In this paper, we provide a comprehensive survey on the applications of the state-of-the-art model-free learning mechanisms in cognitive wireless networks. According to the system models that those applications are based on, a systematic overview of the learning algorithms in the domains of single-agent systems, multi-agent systems and multi-player games is provided. Furthermore, the applications of model-free learning to various problems in cognitive wireless networks are discussed, with a focus on how the learning mechanisms help to provide solutions to these problems and improve network performance over the existing model-based, non-adaptive methods. Finally, a broad spectrum of challenges and open issues is discussed to offer a guideline for future research directions.

“…Herein, is the belief factor, and and are called the reference points [10]. The belief functions deployed by the SUs are based on the concept of reciprocity, which refers to the interaction mechanism that if the SUs realize the probabilities of interacting with each other in the future is high, they will consider their influence on other SUs' strategies.…”

Section: A. The Belief Function (mentioning)

confidence: 99%

Reciprocity inspired learning for opportunistic spectrum access in cognitive radio networks

Chen, Cheng, et al.

2013

8th International Conference on Cognitive Radio Oriented Wireless Networks

This paper addresses opportunistic spectrum access (OSA) in non-cooperative cognitive radio networks (CRNs). The selfish behaviors of the secondary users (SUs) will cause a CRN to collapse. The SUs are thus enabled to build beliefs about how other SUs would respond to their decision makings. The interaction among the SUs is modeled as a stochastic learning process. In this way, each SU can independently learn the behaviors of the competitors, optimize the OSA strategies, and finally achieve the goal of reciprocity. Two learning algorithms are proposed to stabilize the stochastic CRNs, the convergence properties of which are also proven theoretically. Simulation results validate the performance of the proposed results, and show that the achieved system performance outperforms some existing protocols.
