Objective: Recent deep learning techniques hold promise for enabling IMU-driven gait assessment; however, they require large amounts of marker-based motion capture and ground reaction force (GRF) data to serve as labels for supervised model training. We thus propose a self-supervised learning (SSL) framework that leverages large IMU datasets to pre-train deep learning models, which can improve the accuracy and data efficiency of IMU-based vertical GRF (vGRF) estimation.

Methods: To pre-train the models, we performed SSL by masking a random portion of the input IMU data and training a transformer model to reconstruct the masked portion. We systematically compared a series of masking ratios across three pre-training datasets that included real IMU data, synthetic IMU data, or a combination of the two. Finally, we built models that combined pre-training with labeled data to estimate vGRF in three prediction tasks: overground walking, treadmill walking, and drop landing.

Results: When using the same amount of labeled data, SSL pre-training improved the accuracy of vGRF estimation during walking compared to baseline models trained by conventional supervised learning. The correlation coefficients for vGRF estimation improved from 0.92 to 0.95 for overground walking and from 0.94 to 0.97 for treadmill walking. In addition, fine-tuning pre-trained models on 1–10% of the walking data yielded accuracy comparable to that of the baseline model trained on 100% of the walking data.

Conclusion: The proposed SSL framework leveraged large real and synthetic IMU datasets to increase the accuracy and data efficiency of deep-learning-based vGRF estimation, reducing the need for labeled data.

Significance: This work may unlock broader use cases of IMU-driven assessment where only small labeled datasets are available.
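
The masking step described in the Methods can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the patch-wise masking scheme, the `patch_len` parameter, and the zero-fill convention are all assumptions made for illustration. The abstract states only that a random portion of the IMU input is masked and that a transformer is trained to reconstruct that portion.

```python
import numpy as np


def mask_imu_window(window, mask_ratio, patch_len=8, rng=None):
    """Hide a random subset of time patches in one IMU window.

    window:     array of shape (time_steps, channels), e.g. accel + gyro axes
    mask_ratio: fraction of patches to hide (the quantity swept in pre-training)
    patch_len:  hypothetical patch size in time steps (assumption)
    Returns the masked copy and a boolean mask over time steps (True = hidden).
    """
    rng = np.random.default_rng() if rng is None else rng
    n_steps = window.shape[0]
    n_patches = n_steps // patch_len
    n_masked = int(round(mask_ratio * n_patches))

    # Pick distinct patches to hide.
    hidden = rng.choice(n_patches, size=n_masked, replace=False)
    mask = np.zeros(n_steps, dtype=bool)
    for p in hidden:
        mask[p * patch_len:(p + 1) * patch_len] = True

    masked = window.copy()
    masked[mask] = 0.0  # zero-fill is an assumption; a learned token is another option
    return masked, mask


def masked_reconstruction_loss(reconstruction, target, mask):
    """Mean squared error computed over the hidden time steps only."""
    if not mask.any():
        return 0.0
    diff = reconstruction[mask] - target[mask]
    return float(np.mean(diff ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    window = rng.standard_normal((128, 6))  # synthetic stand-in for real IMU data
    masked, mask = mask_imu_window(window, mask_ratio=0.5, rng=rng)
    # A model would map `masked` to a reconstruction; here the masked input
    # itself stands in for an untrained model's output.
    print("hidden steps:", int(mask.sum()))
    print("loss:", masked_reconstruction_loss(masked, window, mask))
```

In a pre-training loop, the loss above would be minimized with respect to the transformer's parameters, and the resulting weights would then be fine-tuned on the labeled vGRF data.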