Cross-modal human pose estimation has a wide range of applications. Traditional image-based pose estimation will not work well in poor light or darkness. Therefore, some sensors such as LiDAR or Radio Frequency (RF) signals are now using to estimate human pose. However, it limits the application that these methods require much high-priced professional equipment. To address these challenges, we propose a new WiFi-based pose estimation method. Based on the Channel State Information (CSI) of WiFi, a novel architecture CSI-former is proposed to innovatively realize the integration of the multi-head attention in the WiFi-based pose estimation network. To evaluate the performance of CSI-former, we establish a span-new dataset Wi-Pose. This dataset consists of 5 GHz WiFi CSI, the corresponding images, and skeleton point annotations. The experimental results on Wi-Pose demonstrate that CSI-former can significantly improve the performance in wireless pose estimation and achieve more remarkable performance over traditional image-based pose estimation. To better benefit future research on the WiFi-based pose estimation, Wi-Pose has been made publicly available.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.