Despite the empirical success of knowledge distillation, a theoretical foundation that naturally leads to computationally inexpensive implementations is still lacking. To address this concern, we forge an alternative connection between information theory and knowledge distillation using a recently proposed entropy-like functional. In doing so, we introduce two distinct, complementary losses that aim to maximise the correlation and the mutual information between the student and teacher representations. Our method achieves performance competitive with the state of the art on the knowledge distillation and cross-model transfer tasks, while incurring significantly lower training overhead than closely related and similarly performing approaches. We further demonstrate the effectiveness of our method on a binary distillation task, where it sets a new state of the art for binary quantisation. The code, evaluation protocols, and trained models will be made publicly available.
In recent years, Convolutional Neural Networks (CNNs) have been widely used in remote sensing applications such as marine surveillance, traffic management, and road network detection. However, CNNs have extremely high computational, bandwidth, and memory requirements, whereas the computational capabilities of on-board hardware are limited, so implementing a CNN on space-grade devices such as FPGAs for on-board processing of the acquired images poses many challenges. Hence, implementations have to be carefully planned. In this paper, the authors present their work towards the implementation of an efficient CNN on a space-grade FPGA in order to process very-high-resolution remotely sensed images on board as soon as the data are provided by the sensor. All this work has been conducted within the EU-funded VIDEO project. As presented in this paper, the work includes the introduction of a methodology based on the project constraints, the evaluation of different state-of-the-art CNN architectures by means of a new efficiency measurement also proposed in this work, the introduction of a new efficient CNN architecture, and, finally, its optimized hardware implementation by means of high-level synthesis tools. The results obtained following the proposed methodology demonstrate that the proposed architecture is able to detect targets of interest in RGB images with much higher efficiency than state-of-the-art solutions, while requiring far fewer computing and memory resources.