“…In contrast, our proposed framework is essentially learning-based without the need for explicit 3D reconstruction [69,77,28]. Learning-based NVS emerges with the development of convolutional neural networks (CNN) [80,52,43,55,47,46,21,48,42]. Early attempts directly map an input image to a paired target image with an encoder-decoder structure [11].…”