“…The pyramid structure is a hierarchical structure that can preserve multi-scale features including both the local details and the global arrangement information. For pixel-wise prediction tasks such as semantic segmation [ 46 , 47 , 48 , 49 , 50 , 51 ], object detection [ 52 , 53 , 54 , 55 , 56 , 57 ] and depth map estimation [ 20 , 21 , 58 , 59 , 60 , 61 ], fusing features in different types and scales can be very helpful for DNN models to learn more fruitful information and then gain better performance.…”