In many domestic and military applications, aerial vehicle detection and super-resolution algorithms are frequently developed and applied independently. However, aerial vehicle detection on super-resolved images remains a challenging task because super-resolved images lack discriminative information. To address this problem, we propose a Joint Super-Resolution and Vehicle Detection Network (Joint-SRVDNet) that aims to generate discriminative, high-resolution images of vehicles from low-resolution aerial images. First, aerial images are upscaled by a factor of 4× using a Multi-scale Generative Adversarial Network (MsGAN), which has multiple intermediate outputs with increasing resolutions. Second, a detector is trained on the images super-resolved by a factor of 4× with the MsGAN architecture. Finally, the detection loss is minimized jointly with the super-resolution loss to encourage the target detector to be sensitive to the subsequent super-resolution training. The network jointly learns hierarchical and discriminative features of targets and produces optimal super-resolution results. We perform both quantitative and qualitative evaluations of the proposed network on the VEDAI, xView, and DOTA datasets. The experimental results show that the proposed framework achieves better visual quality than state-of-the-art methods for aerial super-resolution with a 4× upscaling factor and improves the accuracy of aerial vehicle detection.
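The joint objective described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract states only that the detection loss is minimized jointly with the super-resolution loss and that MsGAN has intermediate outputs of increasing resolution. The weighted-sum form, the weight values, the doubling of resolution per stage, and the names `JointLossWeights`, `intermediate_sizes`, and `joint_loss` are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class JointLossWeights:
    # Hypothetical weights; the abstract does not report any values.
    sr: float = 1.0
    det: float = 1.0


def intermediate_sizes(height: int, width: int, final_factor: int = 4) -> List[Tuple[int, int]]:
    """Output sizes at each scale, ASSUMING resolution doubles per stage (2x, 4x, ...)."""
    sizes = []
    factor = 2
    while factor <= final_factor:
        sizes.append((height * factor, width * factor))
        factor *= 2
    return sizes


def joint_loss(sr_losses: List[float], det_loss: float,
               weights: Optional[JointLossWeights] = None) -> float:
    """Weighted sum of the per-scale super-resolution losses and the detection loss.

    The weighted-sum form is an assumption; the source only says the two
    losses are minimized jointly.
    """
    w = weights if weights is not None else JointLossWeights()
    return w.sr * sum(sr_losses) + w.det * det_loss
```

Under these assumptions, a 64×64 low-resolution input yields intermediate outputs at 128×128 and 256×256, and `joint_loss` combines one super-resolution loss term per scale with a single detection loss term. In a real training loop the scalar inputs would be differentiable tensors, so that gradients from the detection term also reach the super-resolution generator.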