“…Learning-based approaches often build a deep network that models the fusion process and produce the target image by feeding the observed images into that network [11], [14], [36]. Some approaches strengthen fusion through the network structure itself, using 3D convolutional neural networks (CNNs) [38], residual networks [39], multiscale structures [40], pyramid networks [41], attention networks [42], [43], cross-mode information [44], dense networks [45], [46], or adversarial networks [47], [48]. Others use detail information from high-spatial-resolution conventional images to improve performance [49]-[52], while some combine model-based and deep learning-based approaches in a hybrid [53]-[55], [65], [66].…”