Inferring the transportation mode of users in a network is of paramount importance in planning, designing, and operating intelligent transportation systems. Previous studies have mainly relied on GPS data. Although models built on such data perform well, the data are limited to certain participants and require their involvement, which makes large-scale implementation difficult. Owing to their ubiquitous and pervasive nature, Wi-Fi networks have the potential to collect large-scale, low-cost, passive, and disaggregate data on multimodal transportation. In this study, we attempt to address these limitations by passively collecting Wi-Fi network data on a congested urban road in downtown Toronto. We develop a semi-supervised deep residual network (ResNet) framework that utilizes Wi-Fi communications obtained from smartphones. The semi-supervised framework makes use of an ample amount of easily collected, low-cost unlabelled data, coupled with a relatively small labelled dataset. By incorporating a ResNet architecture as the core of the framework, we take advantage of high-level features not considered in traditional machine learning frameworks. The proposed framework shows promising performance on the collected data, with a prediction precision of 81.4% for walking, 80.5% for biking, and 84.9% for driving.