“…As can be observed in Section 4.3, the overwhelming majority of the state-of-the-art research for domain adaptation of object detectors in videos use self-training in one form or another [29,30,31,32,33,36,37,38,40,42,11,10]. In order to adapt a generic pedestrian detector to a specific scene, a typical system would run the generic detector on some frames in a video, then score each detection using some heuristics and afterwards, add the most confident positive and negative detections to the original dataset for retraining.…”