“…To enhance compatibility with the DPU and optimize detection efficiency, we modify the original YOLOv5 network in two ways: all activation functions use Sigmoid [12], and the pooling kernels of the SPP [13] structure are set to 3×3, 5×5, and 7×7. The training settings are a single target category (Fire), 9 anchors [14] given as (width, height) pairs, namely (10,13), (16,30), (33,23), (30,61), (62,45), (59,119), (116,90), (156,198), and (373,326), and an input image resolution of 416×416. We apply mosaic data augmentation [15] during training, with 24 frozen training iterations and 48 iterations in total.…”
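The settings quoted above can be checked with a little arithmetic. The following is a minimal sketch, not the authors' code: it pairs the 18 anchor values into 9 (width, height) anchors, splits them across three detection scales, and confirms that stride-1 max pooling with the 3×3, 5×5, and 7×7 SPP kernels preserves the feature-map size when padded by half the kernel. The three scales with strides 8, 16, and 32, the three-anchors-per-scale split, and all function and constant names are assumptions taken from the stock YOLOv5 layout; the excerpt does not state them.

```python
# Flat anchor list as given in the text: 9 anchors, each a (width, height) pair in pixels.
FLAT_ANCHORS = [10, 13, 16, 30, 33, 23, 30, 61, 62, 45,
                59, 119, 116, 90, 156, 198, 373, 326]

# Assumption: three detection scales with strides 8, 16, 32, as in stock YOLOv5.
STRIDES = (8, 16, 32)
INPUT_SIZE = 416           # input resolution from the text
SPP_KERNELS = (3, 5, 7)    # SPP pooling kernels from the text


def group_anchors(flat, per_scale=3):
    """Pair up (w, h) values, then split the pairs into per-scale groups."""
    if len(flat) % 2 != 0:
        raise ValueError("anchor list must contain an even number of values")
    pairs = list(zip(flat[0::2], flat[1::2]))
    if len(pairs) % per_scale != 0:
        raise ValueError("anchor count must be a multiple of per_scale")
    return [pairs[i:i + per_scale] for i in range(0, len(pairs), per_scale)]


def grid_sizes(input_size, strides):
    """Feature-map side length at each detection scale."""
    return [input_size // s for s in strides]


def pooled_size(size, kernel, stride=1):
    """Output side length of a max pool padded by kernel // 2 (odd kernels)."""
    pad = kernel // 2
    return (size + 2 * pad - kernel) // stride + 1


if __name__ == "__main__":
    for stride, group in zip(STRIDES, group_anchors(FLAT_ANCHORS)):
        print(f"stride {stride}: anchors {group}")
    print("grid sizes:", grid_sizes(INPUT_SIZE, STRIDES))
    deepest = grid_sizes(INPUT_SIZE, STRIDES)[-1]
    for k in SPP_KERNELS:
        print(f"SPP kernel {k}x{k}: {deepest} -> {pooled_size(deepest, k)}")
```

Under these assumptions a 416×416 input gives 52×52, 26×26, and 13×13 grids, and each SPP branch keeps the 13×13 map unchanged, so the pooled branches can be concatenated along the channel axis.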