Abstract-In this work we design and develop Montage, a system for real-time multi-user formation tracking and localization using off-the-shelf smartphones. Montage achieves submeter-level tracking accuracy by integrating temporal and spatial constraints from user movement vector estimation and distance measurement. Montage comprises a suite of novel techniques that surmount a variety of challenges in real-time tracking, without infrastructure or fingerprints, and without any a priori user-specific (e.g., stride length and phone placement) or site-specific (e.g., digitized map) knowledge. We implemented, deployed, and evaluated Montage in both outdoor and indoor environments. Our experimental results (847 traces from 15 users) show that the stride length estimated by Montage has an error within 9 cm over all users, and that the moving direction estimated by Montage has an error within 20°. For real-time tracking, Montage provides meter-second-level formation tracking accuracy with off-the-shelf mobile phones.