“…Using geometric view synthesis as a learning objective, this approach has been successfully applied to a wide range of key robot vision tasks in the challenging monocular setting, including the estimation of ego-motion [1], [2], the 6-degree-of-freedom camera translation and rotation; depth [3], [4], [5], [6], the per-pixel distance value from the image plane; optical flow [7], [8], [9], [10], [11], the 2D pixel displacement between frames; and scene flow [12], [13], [14], [15], the 3D motion of each point in the scene. Although these tasks are clearly related [12], [13], [16], they typically require stereo pairs at training time [14], [15] to resolve reprojection ambiguities in self-supervision.…”