Vision-aided wireless communication is motivated by the recent advances in deep learning and computer vision as well as the increasing dependence on line-of-sight links in millimeter wave (mmWave) and terahertz systems. By leveraging vision, this new research direction enables an interesting set of new capabilities such as vision-aided mmWave beam and blockage prediction, proactive hand-off, and resource allocation, among others. These capabilities have the potential to reliably support highly mobile applications such as vehicular/drone communications and wireless virtual/augmented reality in mmWave and terahertz systems. Investigating these applications, however, requires the development of specialized datasets and machine learning tasks. Based on the Vision-Wireless (ViWi) dataset generation framework [1], this paper develops an advanced and realistic scenario/dataset that features multiple base stations, mobile users, and rich dynamics. Enabled by this dataset, the paper defines the vision-wireless mmWave beam tracking task (ViWi-BT) and proposes a baseline solution that can provide an initial benchmark for future ViWi-BT algorithms.
This paper investigates a novel research direction that leverages vision to help overcome critical wireless communication challenges. In particular, this paper considers millimeter wave (mmWave) communication systems, which are principal components of 5G and beyond. These systems face two important challenges: (i) the large training overhead associated with selecting the optimal beam and (ii) the reliability challenge due to the high sensitivity to link blockages. Interestingly, most of the devices that employ mmWave arrays will likely also use cameras, such as 5G phones, self-driving vehicles, and virtual/augmented reality headsets. Therefore, we investigate the potential gains of employing cameras at the mmWave base stations and leveraging their visual data to help overcome the beam selection and blockage prediction challenges. To do that, this paper exploits computer vision and deep learning tools to predict mmWave beams and blockages directly from the camera RGB images and the sub-6GHz channels. The experimental results reveal interesting insights into the effectiveness of such solutions. For example, the deep learning model achieves over 90% beam prediction accuracy while requiring only a single image of the scene and incurring zero beam training overhead.
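To make the beam training overhead mentioned in challenge (i) concrete, the sketch below shows conventional exhaustive beam search over a DFT codebook: the base station must measure the received power of every candidate beam before selecting one, so the overhead scales with the codebook size. This is the step that vision-aided prediction aims to bypass. The code is a minimal illustration under a simple line-of-sight channel model; the function names, antenna counts, and the channel itself are illustrative assumptions, not the paper's actual setup.

```python
import cmath
import math


def dft_codebook(n_antennas, n_beams):
    # Each entry is a candidate beamforming vector (classic DFT codebook).
    return [
        [cmath.exp(-2j * math.pi * a * b / n_beams) / math.sqrt(n_antennas)
         for a in range(n_antennas)]
        for b in range(n_beams)
    ]


def exhaustive_beam_search(channel, codebook):
    # Conventional beam training: measure the received power of every beam
    # and keep the best one -- the training overhead grows linearly with
    # the number of candidate beams.
    def received_power(beam):
        return abs(sum(h * w for h, w in zip(channel, beam))) ** 2
    return max(range(len(codebook)), key=lambda b: received_power(codebook[b]))


# Hypothetical line-of-sight channel toward normalized spatial frequency 0.3
# (purely illustrative values).
n_ant, n_beams = 16, 16
theta = 0.3
h = [cmath.exp(2j * math.pi * a * theta) for a in range(n_ant)]
best_beam = exhaustive_beam_search(h, dft_codebook(n_ant, n_beams))
```

Here `best_beam` is the codebook index whose steering direction is closest to the channel's angle of departure; a vision-aided predictor would instead output this index directly from an RGB image, skipping the per-beam measurements entirely.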