Forest fire detection is a challenging task because fires vary widely in shape, texture, and color. Traditional image processing methods rely heavily on hand-crafted features, which do not generalize to all forest scenarios. To solve this problem, deep learning is applied to learn and extract forest fire features adaptively. However, the limited learning and perception ability of an individual learner is not sufficient for it to perform well in complex tasks. Furthermore, learners tend to focus too much on local information, namely the ground truth, and ignore global information, which may lead to false positives. In this paper, a novel ensemble learning method is proposed to detect forest fires in different scenarios. Firstly, two individual learners, Yolov5 and EfficientDet, are integrated to accomplish the fire detection process. Secondly, another individual learner, EfficientNet, is responsible for learning global information to avoid false positives. Finally, detection results are made based on the decisions of the three learners. Experiments on our dataset show that the proposed method improves detection performance by 2.5% to 10.9% and decreases false positives by 51.3%, without any extra latency.
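The three-learner decision described above can be illustrated with a minimal sketch. The abstract does not state the actual fusion rule, so everything here is an assumption made for illustration: the `Box` type, the IoU-based merging of the two detectors' outputs, the thresholds, and the rule that a low image-level fire probability from the global classifier vetoes all boxes.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Box:
    x1: float
    y1: float
    x2: float
    y2: float
    score: float


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a.x2, b.x2) - max(a.x1, b.x1))
    ih = max(0.0, min(a.y2, b.y2) - max(a.y1, b.y1))
    inter = iw * ih
    union = (a.x2 - a.x1) * (a.y2 - a.y1) + (b.x2 - b.x1) * (b.y2 - b.y1) - inter
    return inter / union if union > 0 else 0.0


def ensemble_detect(
    boxes_a: List[Box],       # output of detector 1 (e.g. Yolov5)
    boxes_b: List[Box],       # output of detector 2 (e.g. EfficientDet)
    global_fire_prob: float,  # image-level probability from the classifier
    iou_thr: float = 0.5,     # hypothetical matching threshold
    global_thr: float = 0.5,  # hypothetical veto threshold
) -> List[Box]:
    # Assumed veto rule: if the global learner sees no fire in the whole
    # image, every local detection is treated as a false positive.
    if global_fire_prob < global_thr:
        return []

    merged: List[Box] = []
    unmatched_b = list(boxes_b)
    for a in boxes_a:
        best = max(unmatched_b, key=lambda b: iou(a, b), default=None)
        if best is not None and iou(a, best) >= iou_thr:
            # Both detectors agree: average coordinates, keep the higher score.
            unmatched_b.remove(best)
            merged.append(Box((a.x1 + best.x1) / 2, (a.y1 + best.y1) / 2,
                              (a.x2 + best.x2) / 2, (a.y2 + best.y2) / 2,
                              max(a.score, best.score)))
        else:
            merged.append(a)
    # Boxes found by only the second detector are kept as they are.
    merged.extend(unmatched_b)
    return merged
```

In this sketch the classifier acts only as a gate on the detectors' output; a weighted combination of the three scores would be an equally plausible reading of the abstract.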
Forest fires are highly unpredictable and extremely destructive, which makes effective prevention and control difficult. Once a fire spreads, it causes devastating damage to natural resources and the ecological environment. To detect early forest fires in real time and provide firefighting assistance, we propose a vision-based detection and spatial localization scheme and develop a system carried on an unmanned aerial vehicle (UAV) with an OAK-D camera. During periods of high forest fire incidence, UAVs equipped with our system are deployed to patrol the forest. Our scheme includes two key aspects. First, the lightweight model NanoDet is applied as a detector to identify and locate fires in the field of view. Techniques such as the cosine learning rate strategy and data augmentation are employed to further enhance mean average precision (mAP). After the detector captures 2D images containing fires, binocular stereo vision is applied to calculate the depth map, where the HSV-Mask filter and non-zero mean method are proposed to eliminate interference values when calculating the depth of the fire area. Second, to obtain the latitude, longitude, and altitude (LLA) coordinates of the fire area, coordinate frame conversion is used along with data from the GPS module and the inertial measurement unit (IMU) module. To test the effectiveness of the system, we experiment with a simulated fire in a forest area. The results show that 89.34% of the suspicious frames with flame targets are detected and that the localization error in latitude and longitude is on the order of 10⁻⁵ degrees; this demonstrates that the system meets our precision requirements and is sufficient for forest fire inspection.
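The depth step described above (an HSV mask followed by a non-zero mean) might look roughly like the following sketch. The abstract gives no thresholds or data layout, so the flat pixel and depth lists, the function names, and the HSV bounds are all assumptions chosen for illustration. Zero is taken to mean "no valid stereo match", a common convention for depth maps.

```python
import colorsys
from typing import List, Optional, Tuple

RGB = Tuple[int, int, int]


def is_fire_colored(rgb: RGB, h_max: float = 0.17,
                    s_min: float = 0.4, v_min: float = 0.6) -> bool:
    """Hypothetical HSV mask: bright, saturated red-to-yellow pixels.

    Thresholds are illustrative only; hue is in [0, 1) as returned by
    colorsys, so 0.17 is roughly 60 degrees (yellow).
    """
    r, g, b = (c / 255.0 for c in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    return h <= h_max and s >= s_min and v >= v_min


def fire_depth(pixels: List[RGB], depths: List[float]) -> Optional[float]:
    """Mean depth over pixels inside a detected box that pass the HSV mask
    and have a non-zero depth value.

    `pixels` and `depths` are flattened, aligned lists for the box region.
    Returns None when no pixel qualifies.
    """
    valid = [d for rgb, d in zip(pixels, depths)
             if d > 0 and is_fire_colored(rgb)]
    return sum(valid) / len(valid) if valid else None


# Two flame pixels with valid depth, one flame pixel with a missing depth
# value (0.0), and one green background pixel.
pixels = [(255, 120, 0), (255, 140, 10), (20, 80, 20), (250, 100, 0)]
depths = [12.0, 0.0, 30.0, 14.0]
print(fire_depth(pixels, depths))  # 13.0
```

Averaging only masked, non-zero values keeps background foliage and failed stereo matches from pulling the estimate away from the flame's true distance.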
Deep learning-based forest fire vision monitoring methods have developed rapidly and are becoming mainstream. The existing methods, however, depend on enormous amounts of data and suffer from weak feature extraction, poor small-target recognition, and many missed and false detections in complex forest scenes. To solve these problems, we proposed a multi-task learning-based forest fire detection model (MTL-FFDet), which contains three tasks (detection, segmentation, and classification) that share a feature extraction module. In addition, to improve detection accuracy and decrease missed and false detections, we proposed a joint multi-task non-maximum suppression (NMS) processing algorithm that fully utilizes the advantages of each task. Furthermore, because the parts of a divided flame target in an image are still flame targets, our proposed data augmentation strategy, a diagonal swap of random origin, is a good remedy for the poor detection performance caused by small fire targets. Experiments showed that our model outperforms YOLOv5-s in terms of mAP (mean average precision) by 3.2%, APs (average precision for small objects) by 4.8%, ARs (average recall for small objects) by 4.0%, and other metrics by 1% to 2%. Finally, the visualization analysis showed that our multi-task model focuses on the target region better than the single-task model during feature extraction and has superior extraction ability.
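One possible reading of the "diagonal swap of random origin" augmentation is sketched below. The abstract does not define the operation, so this is an interpretation: a random origin splits the image into four quadrants, diagonally opposite quadrants trade places, and any box crossing a cut line is divided into pieces that each remain a flame label. The function names and the half-open `(x1, y1, x2, y2)` box convention are hypothetical.

```python
import random
from typing import List, Optional, Tuple

Image = List[List[int]]            # rows of pixel values
BBox = Tuple[int, int, int, int]   # (x1, y1, x2, y2), half-open pixel ranges


def _split(lo: int, hi: int, cut: int, size: int) -> List[Tuple[int, int]]:
    """Split the interval [lo, hi) at `cut` and map each piece to its
    position after the swap along an axis of length `size`."""
    parts = []
    if lo < cut:   # piece before the cut moves to the far end
        parts.append((lo + size - cut, min(hi, cut) + size - cut))
    if hi > cut:   # piece after the cut moves to the start
        parts.append((max(lo, cut) - cut, hi - cut))
    return parts


def diagonal_swap(image: Image, boxes: List[BBox],
                  origin: Optional[Tuple[int, int]] = None
                  ) -> Tuple[Image, List[BBox]]:
    """Swap diagonally opposite quadrants around `origin` (x0, y0).

    Assumes the image is at least 2x2. A box that crosses a cut line is
    divided into up to four smaller boxes, which yields extra small targets.
    """
    h, w = len(image), len(image[0])
    if origin is None:
        origin = (random.randint(1, w - 1), random.randint(1, h - 1))
    x0, y0 = origin
    # Moving rows y0.. to the top and columns x0.. to the left exchanges
    # top-left with bottom-right and top-right with bottom-left.
    swapped = [row[x0:] + row[:x0] for row in image[y0:] + image[:y0]]
    new_boxes = [(xa, ya, xb, yb)
                 for (x1, y1, x2, y2) in boxes
                 for (xa, xb) in _split(x1, x2, x0, w)
                 for (ya, yb) in _split(y1, y2, y0, h)]
    return swapped, new_boxes
```

For example, on a 4×4 image with origin `(1, 2)`, a 2×2 box straddling both cut lines is returned as four 1×1 boxes, one in each corner of the swapped image.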