“…We first review 11 widely used datasets for MOT, including KITTI [177], [211], [212], MOT15 [213], DukeMTMCT [117], MOT16-17 [143], PathTrack [124], UA-DETRAC [214], PoseTrack [215], [216], MOTS [37], CityFlow [20], KITTI MOTS [37], MOT20 [32], [144], nuScenes [217], Waymo [218], BDD100K [219], [220], and VisDrone [221], [222], [223], [224]. These datasets mainly focus on person and vehicle tracking.…”

Table fragment (the headers of the two count columns are not recoverable):

| Dataset | Year | | | Annotation |
|---|---|---|---|---|
| PoseTrack [215] | 2018 | - | 153.6K | Pose |
| VisDrone [223] | 2018 | - | 1.8M | 2D box |
| BDD100K [219] | 2018 | 160K | 4M | 2D box/Mask |
| MOTS [37] | 2019 | 228 | 26K | Mask |
| KITTI MOTS [37] | 2019 | - | - | Mask |
| CityFlow [20] | 2019 | 666 | 230K | 2D box |
| MOT20 [144] | 2020 | 3,833 | 2.1M | 2D box |
| Waymo [218] | 2020 | - | 12.6M | 2D box/3D box |
| nuScenes [217] | 2020 | - | 1.4M | 3D box |