Multi-object tracking (MOT) is an essential task in computer vision. With the rapid development of deep learning in recent years, MOT has improved greatly. However, some challenges remain, such as sensitivity to occlusion, instability under varying lighting conditions, and poor robustness to deformable objects. To address these challenges, which are common to most existing trackers, this paper proposes a tracklet booster algorithm that can be built upon any other tracker. The motivation is simple and straightforward: split tracklets at potential ID-switch positions, then connect multiple tracklets into one if they belong to the same object. Accordingly, the tracklet booster consists of two parts, a Splitter and a Connector. First, an architecture of stacked temporal dilated convolution blocks predicts the splitting positions, trained with a label smoothing strategy that uses adaptive Gaussian kernels. Then, a multi-head self-attention based encoder produces tracklet embeddings, which are used to connect tracklets into larger groups. We conduct extensive experiments on the MOT17 and MOT20 benchmark datasets, which demonstrate promising results. Combined with the proposed tracklet booster, existing trackers can usually achieve large improvements in IDF1 score, which shows the effectiveness of the proposed method.
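To make the label smoothing idea concrete, the following is a minimal sketch of how soft per-frame split targets could be built with Gaussian kernels. It is an illustration under stated assumptions, not the paper's implementation: the function name `smooth_split_labels`, the parameters `scale` and `min_sigma`, and the rule that adapts the kernel width to the distance from the nearest neighbouring split or tracklet end are all hypothetical, since the abstract does not specify how the kernels adapt.

```python
import math


def smooth_split_labels(length, split_positions, scale=0.1, min_sigma=1.0):
    """Return soft split labels in [0, 1] for a tracklet of `length` frames.

    Each ground-truth split position becomes a Gaussian bump instead of a
    single hard 1. The kernel width rule below is an assumption made for
    illustration only.
    """
    labels = [0.0] * length
    positions = sorted(set(split_positions))
    for i, pos in enumerate(positions):
        # Hypothetical adaptivity rule: widen the kernel when the split is
        # far from its neighbours, narrow it when splits are close together.
        left = positions[i - 1] if i > 0 else 0
        right = positions[i + 1] if i + 1 < len(positions) else length - 1
        gap = max(1, min(pos - left, right - pos))
        sigma = max(min_sigma, scale * gap)
        for t in range(length):
            value = math.exp(-((t - pos) ** 2) / (2.0 * sigma ** 2))
            # Keep the strongest response where kernels overlap.
            labels[t] = max(labels[t], value)
    return labels


# Usage: one split at frame 10 of a 20-frame tracklet.
soft = smooth_split_labels(20, [10])
```

The soft targets peak at the annotated split frame and decay symmetrically around it, so a predictor that is off by a frame or two is penalised less than one that misses the split entirely.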