“…MAMP outperforms existing self-supervised methods and is on par with some supervised methods trained with large amounts of annotated data. Notation: Video Colorization [34], RPM-Net [11], CycleTime [38], CorrFlow [13], MuG [19], UVC [14], MAST [12], OSVOS [1], RANet [39], OSVOS-S [21], GC [16], OSMN [42], SiamMask [37], OnAVOS [33], FEELVOS [32], AFB-URR [17], PReMVOS [20], STM [24], KMN [28], CFBI [43]. Semi-supervised video object segmentation techniques fall into two categories: supervised and self-supervised. Supervised approaches [24,43] learn the model from the rich annotation information in the training data and have achieved great success in video object segmentation.…”