Ambient backscatter (AB) communication is an emerging wireless communication technology that enables wireless devices (WDs) to communicate without requiring active radio transmission. In an AB communication system, a WD switches between communication and energy harvesting modes. The harvested energy is used to power the device's operations, e.g., circuit power consumption and sensing. In this paper, we focus on maximizing the throughput of an AB communication system by adaptively selecting the operating mode in a fading channel environment. We model the problem as an infinite-horizon Markov decision process (MDP) and obtain the optimal mode-switching policy by the value iteration algorithm when the channel distributions are known. When knowledge of the channel distribution is absent, a Q-learning (QL) method is applied to explore a suboptimal strategy through the device's repeated interactions with the environment. Finally, our simulations show that the proposed QL method achieves close-to-optimal throughput performance and significantly outperforms other representative benchmark methods.
Index Terms-Ambient backscatter communication, Markov decision process, reinforcement learning, Q-learning.
I. INTRODUCTION
The future Internet of Things (IoT) technology interconnects numerous sensing devices with communication capability for a wide range of applications, e.g., remote monitoring, automatic control, diagnosis, and maintenance [1]. Recently, a new communication paradigm named ambient backscatter (AB) communication has been widely studied as an energy-efficient method applicable to IoT systems [2]. In particular, a tag transmitter in an AB communication system communicates with its receiver by backscattering ambient radio frequency (RF) signals. Specifically, a transmitter tag transmits '0' or '1' by switching its antenna to non-reflecting or reflecting mode, respectively. Compared to the conventional backscatter communication scheme in radio frequency identification (RFID) systems, AB communication does not require a dedicated energy-emitting reader and relies solely on external energy sources in the ambient environment, such as WiFi, public radio, and cellular transmissions. As such, AB communication can effectively reduce the deployment cost of large-scale IoT networks, such as those for smart homes, smart cities, and environment monitoring [3], [4], [5].
There has been tremendous research interest recently in ambient backscatter communications [6], [7]. For instance, [8] analyzed the bit error rate of an AB communication link when the receiver uses an energy detector to detect the 1-bit information transmitted per channel use. [9] integrated AB communication with the conventional harvest-then-transmit (HTT) protocol in RF-powered cognitive radio networks, where the backscatter tag can choose to backscatter the ambient RF signal to the receiver or harvest energy for later active transmissions. To achieve the optimal throughput performance, the authors assume ...