We consider the problem of multi-user spectrum access in wireless networks. The bandwidth is divided into K orthogonal channels, and M users aim to access the spectrum. Each user chooses a single channel for transmission at each time slot. The state of each channel is modeled by a restless Markovian process with unknown parameters. Previous studies have analyzed a special case of this setting, in which each channel yields the same expected rate for all users. By contrast, we consider a more general and practical model, where each channel yields a different expected rate for each user. This model adds a significant challenge: how to efficiently learn a channel allocation in a distributed manner so as to achieve a global system-wide objective. We adopt the stable matching utility as the system objective, which is known to yield strong performance in multichannel wireless networks, and develop a novel Distributed Stable Strategy Learning (DSSL) algorithm to achieve the objective. We prove theoretically that DSSL converges to the stable matching allocation, and that the regret, defined as the loss in total rate with respect to the stable matching solution, grows logarithmically with time. Finally, simulation results demonstrate the strong performance of the DSSL algorithm.
I. INTRODUCTION

We consider the spectrum access problem, where a shared bandwidth is divided into K orthogonal channels (i.e., subbands), and M users want to access the spectrum, where K ≥ M. Each channel is modeled as a Finite-State Markovian Channel (FSMC); the FSMCs are independent and non-identically distributed across channels. The FSMC is a tractable model widely used to capture the time-varying behavior of a radio communication channel [2], [3]. It is often employed to model radio channel dynamics due to primary user occupancy effects in hierarchical cognitive radio networks (where the M secondary, i.e., unlicensed, users are cognitive in the sense that they learn and adapt good access strategies), or due to external interference effects in the open sharing model among M users in the wireless network (e.g., the ISM band) [4], [5]. At each time step, each user experiences a different transmission rate over each channel depending on its FSMC distribution, where the FSMC parameters (i.e., the transition probabilities that govern the Markov chain) are unknown. At each time step, each user is allowed to choose one channel to access and to observe the instantaneous channel state. If two or more users access the same channel at the same time, a collision occurs and the achievable rate is zero.

We adopt the stable matching utility (see Section II for details) as the system objective, which is known to yield strong

Tomer Gafni and Kobi Cohen are with the School of Electrical and Computer Engineering,