“…Both of these approaches rely on assimilating information via their pixel-connectivity to improve feature representations. For scale relations, much effort has gone into fusing features across scales to alleviate the discrepancy among feature maps from different levels of the bottom-up hierarchy and feature scale-space. These efforts include top-down information flow [15, 40, 54], an extra bottom-up information path [31, 43, 68], multiple hourglass structures [46, 81], concatenating features from different layers [4, 20, 38, 59] or different tasks [52], gradual multi-stage local information fusion [58, 75], and pyramid convolutions [67]. Even though standard design principles for scale relations are emerging for ConvNet architectures, the problem is far from being solved.…”