Abstract: In this paper, we propose a novel data augmentation technique (ANDA) applied to the Salient Object Detection (SOD) context. Standard data augmentation techniques proposed in the literature, such as image cropping, rotation, flipping, and resizing, only generate variations of the existing examples, providing limited generalization. Our method has the novelty of creating new images by combining an object with a new background while retaining part of its salience in this new context. To do so, the ANDA techniq…
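The contrast the abstract draws, between transforming an existing example and building a new image from an object and a different background, can be illustrated with a minimal sketch. This is not the ANDA method, whose details are cut off in the abstract above. It is only a plain mask-based paste, and the function names `hflip` and `composite`, and the assumption that the image, mask, and background share the same height and width, are hypothetical choices made for illustration.

```python
import numpy as np


def hflip(image, mask):
    """Standard augmentation: mirror an existing example and its saliency mask."""
    return image[:, ::-1], mask[:, ::-1]


def composite(obj_image, obj_mask, background):
    """Illustrative only: paste the masked object pixels onto a new background.

    obj_image and background are (H, W, 3) arrays; obj_mask is (H, W),
    nonzero where the salient object is. Returns the new image and its mask.
    """
    if obj_image.shape != background.shape:
        raise ValueError("object image and background must have the same shape")
    if obj_mask.shape != obj_image.shape[:2]:
        raise ValueError("mask must match the image height and width")
    selected = obj_mask.astype(bool)
    new_image = background.copy()
    # Copy only the object pixels; everything else stays background.
    new_image[selected] = obj_image[selected]
    return new_image, selected.astype(np.uint8)


if __name__ == "__main__":
    image = np.full((4, 4, 3), 200, dtype=np.uint8)
    mask = np.zeros((4, 4), dtype=np.uint8)
    mask[1:3, 1:3] = 1
    background = np.zeros((4, 4, 3), dtype=np.uint8)
    new_image, new_mask = composite(image, mask, background)
    print(new_image[..., 0])
    print(new_mask)
```

The flip only rearranges pixels that were already in the dataset, whereas the paste produces an image-mask pair whose background was never seen together with that object. A hard paste like this one does nothing to preserve how salient the object is in its new context, which is the part the abstract attributes to ANDA.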
“…Additionally, our proposed framework ease the addition of other modules such as image processing, classification, object detection, semantic segmentation, and some others novel deep learning methods that explore domain adaptation and data generation that can run on the remote server and make use of Hardware-accelerated Deep Neural Networks running on GPU [30], [31], [32], [33].…”
Civilian Unmanned Aerial Vehicles (UAVs) are becoming more accessible for domestic use. Currently, UAV manufacturer DJI dominates the market, and their drones have been used for a wide range of applications. Model lines such as the Phantom can be applied for autonomous navigation where Global Positioning System (GPS) signals are not reliable, with the aid of Simultaneous Localization and Mapping (SLAM), such as monocular Visual SLAM. In this work, we propose a bridge among different systems, such as Linux, Robot Operating System (ROS), Android, and UAVs, as an open source framework, where the gimbal camera recording can be streamed to a remote server, supporting the implementation of an autopilot. Finally, we present some experimental results showing the performance of the video streaming, which validate the framework.
“…In addition, we plan to ship our solution in the field, carefully defining the best hardware in terms of cost-benefit and also the best position of each camera in order to avoid shadows, reflections and vandalism. Finally, we want to explore more advanced data augmentation techniques (e.g., those presented in [25,26]) in order to achieve even better results without having to manually label more thousands of images for training our system.…”
In this work, we present a robust and efficient solution for counting and identifying train wagons using computer vision and deep learning. The proposed solution is cost-effective and can easily replace solutions based on radiofrequency identification (RFID), which are known to have high installation and maintenance costs. According to our experiments, our two-stage methodology achieves impressive results in real-world scenarios, i.e., 100% accuracy in the counting stage and a 99.7% recognition rate in the identification stage. Moreover, the system is able to automatically reject some of the successfully counted train wagons because they have damaged identification codes. The results achieved were surprising considering that the proposed system requires low processing power (i.e., it can run in low-end setups) and that we used a relatively small number of images to train our Convolutional Neural Network (CNN) for character recognition. The proposed method is registered, under number BR512020000808-9, with the National Institute of Industrial Property (Brazil). An article about the proposed system has been published in the October 2020 issue of Railway Gazette International [1], the leading business journal for the worldwide rail industry.
“…To verify the effectiveness of our data augmentation method, five methods are compared with ours. These methods are without data augmentation, H-Flip [14], ANDA [49], IDA [7], and GridMask [31]. In addition to the data augmentation method changes, the other parts of the network are unchanged.…”
Section: Compared With Recent Data Augmentation Methods
Most self-distillation methods need complex auxiliary teacher structures and require many training samples in the object segmentation task. To address this challenge, a self-distillation object segmentation method via frequency domain knowledge augmentation is proposed. Firstly, an object segmentation network that efficiently integrates multilevel features is constructed. Secondly, a pixel-wise virtual teacher generation model is proposed to drive the transfer of pixel-wise knowledge to the object segmentation network through self-distillation learning, so as to improve its generalisation ability. Finally, a frequency domain knowledge adaptive generation method is proposed to augment data, which utilises a differentiable quantisation operator to adjust the learnable pixel-wise quantisation table dynamically. Moreover, we reveal that convolutional neural networks are more inclined to learn low-frequency information during training. Experiments on five object segmentation datasets show that the proposed method can enhance the performance of the object segmentation network effectively. The performance gain of our method is better than that of recent self-distillation methods, and the average Fβ and mIoU are increased by about 1.5% and 3.6% compared with a typical feature refinement self-distillation method.