Abstract: Rapid detection of illicit opium poppy plants using unmanned aerial vehicle (UAV) imagery has become an important means of preventing and combating drug-cultivation crimes. However, current methods rely on time-consuming visual image interpretation. Here, the You Only Look Once version 3 (YOLOv3) network structure was used to assess the influence of different backbone networks on the average precision and detection speed for a UAV-derived dataset of poppy imagery, with MobileNetv2 (MN) selected as …
“…At this stage, we use a Long Short-Term Memory (LSTM) convolutional block (Xu et al., 2020; Li et al., 2020), which is tasked with extracting the 32 most useful features from the entire sequence. For our main neural network backbone, we use MobileNetV2, an extension of MobileNet, as it has been shown to achieve strong predictive performance (Howard et al., 2017; Zhou et al., 2019) while remaining relatively lightweight, since the architecture is designed for low-power devices such as mobile phones. By contrast, YOLOv3, while achieving impressive recall results (Redmon & Farhadi, 2018), is much more complex and performs substantially worse in this setting. The MobileNetV2 output is then connected to a global average pooling layer in order to reduce dimensionality and improve generalization (Zhou et al., 2016).…”
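The pipeline above feeds MobileNetV2 feature maps through global average pooling before the LSTM stage. As a minimal illustration of that pooling step (not the authors' code, and with a toy feature map in place of real backbone output), each channel of an H x W x C map is averaged over all spatial positions to yield a C-dimensional vector:

```python
def global_average_pool(feature_map):
    """Collapse an H x W x C feature map (nested lists) to a C-dim
    vector by averaging each channel over all spatial positions."""
    h = len(feature_map)
    w = len(feature_map[0])
    c = len(feature_map[0][0])
    pooled = [0.0] * c
    for row in feature_map:
        for cell in row:
            for k in range(c):
                pooled[k] += cell[k]
    return [v / (h * w) for v in pooled]

# toy 2x2 spatial map with 3 channels
fmap = [[[1.0, 2.0, 3.0], [3.0, 2.0, 1.0]],
        [[1.0, 0.0, 1.0], [3.0, 4.0, 3.0]]]
print(global_average_pool(fmap))  # [2.0, 2.0, 2.0]
```

Because the output size depends only on the channel count, this layer removes the spatial dimensions entirely, which is what reduces dimensionality and discourages overfitting to position.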
Human posture detection allows the capture of the kinematic parameters of the human body, which is important for many applications, such as assisted living, healthcare, physical exercise and rehabilitation. This task can greatly benefit from recent developments in deep learning and computer vision. In this paper, we propose a novel deep recurrent hierarchical network (DRHN) model based on MobileNetV2 that allows for greater flexibility by reducing or eliminating posture detection problems related to limited visibility of the human torso in the frame, i.e., the occlusion problem. The DRHN network accepts RGB-Depth frame sequences and produces a representation of semantically related posture states. We achieved 91.47% accuracy at a 10 fps rate for sitting posture recognition.
“…Other works linking thermography and UAVs focus on diverse applications, such as crop management [29,30], power line monitoring [31,32], and solar power plant inspection [33][34][35]. The works of [36,37] are examples of applications that use drones and machine learning algorithms to aid inspection processes. Next, we present the details and components of our solution.…”
“…You Only Look Once (YOLO) [51] is a CNN-based method for object detection. It presents excellent results in terms of precision and is widely used in inspection scenarios [36,37], but it requires a lot of computing power for real-time execution. The Aggregated Channel Features (ACF) method [52], which can be viewed as an evolution of the classical Boosted Cascade of Simple Features method proposed by Viola and Jones [47], is another important machine learning technique for object detection.…”
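Both detector families mentioned above (YOLO and ACF-style cascades) produce many overlapping candidate boxes that are filtered by a confidence threshold and non-maximum suppression before use. As a hedged sketch of that standard post-processing step (a generic greedy NMS, not the specific implementation used in [36,37]):

```python
def box_iou(a, b):
    """Intersection-over-Union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy non-maximum suppression: keep the highest-scoring box,
    drop every remaining box overlapping it above iou_thresh, repeat."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order
                 if box_iou(boxes[best], boxes[i]) <= iou_thresh]
    return keep

# two near-duplicate detections of one object plus one distinct object
boxes = [(0, 0, 2, 2), (0, 0, 2, 2.2), (5, 5, 7, 7)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))  # [0, 2]
```

The greedy variant shown here is O(n^2) in the number of boxes; production detectors often use vectorized or class-wise versions, but the logic is the same.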
Frequent and accurate inspections of industrial components and equipment are essential because failures can cause unscheduled downtime, massive material and financial losses, or even endanger workers. In the mining industry, belt idlers or rollers are examples of such critical components. Although there are many precise laboratory techniques to assess the condition of a roller, companies still have trouble implementing a reliable and scalable procedure to inspect their field assets. This article enumerates and discusses the existing roller inspection techniques and presents a novel approach based on an Unmanned Aerial Vehicle (UAV) integrated with a thermal imaging camera. Our preliminary results indicate that, using a signal processing technique, we are able to identify roller failures automatically. We also propose and implement a back-end platform that enables field and cloud connectivity with enterprise systems. Finally, we have cataloged the anomalies detected during the extensive field tests in order to build a structured dataset that will allow for future experimentation.
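The abstract mentions identifying roller failures automatically with "a signal processing technique" but does not specify which one. One simple approach consistent with thermal inspection, offered here purely as an illustrative sketch (the function name and the z-score threshold are assumptions, not the paper's method), is to flag rollers whose surface temperature deviates strongly from the population statistics:

```python
def flag_hot_rollers(temps, k=2.0):
    """Flag roller temperature readings more than k standard deviations
    above the mean of the sequence; returns indices of anomalies."""
    n = len(temps)
    mean = sum(temps) / n
    std = (sum((t - mean) ** 2 for t in temps) / n) ** 0.5
    threshold = mean + k * std
    return [i for i, t in enumerate(temps) if t > threshold]

# five healthy rollers around 30 deg C and one overheating at 60 deg C
print(flag_hot_rollers([30.0, 30.0, 30.0, 30.0, 60.0, 30.0]))  # [4]
```

Note that a strong outlier inflates the sample standard deviation, so a robust statistic (median and MAD) would be preferable in practice; the plain z-score keeps the example short.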
“…This work focuses more on special object detection for UAVs. Zhou et al. [14] used an updated YOLOv3 model to detect opium poppies in images captured by a UAV. Compared with the original YOLOv3 model, their model uses the recently proposed Generalized Intersection over Union (GIoU) as the loss function, and a Spatial Pyramid Pooling unit is added [21]. However, the method runs on an RTX 2080 Ti platform, which means the detection process is offline and cannot immediately benefit the automatic control of UAVs in an unknown environment.…”
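The GIoU measure referenced above extends IoU so that non-overlapping boxes still receive a meaningful (negative) score by penalizing the empty area of the smallest enclosing box. A minimal sketch of the computation, following the published definition rather than any particular detector's code:

```python
def iou_and_giou(a, b):
    """Boxes as (x1, y1, x2, y2). Returns (IoU, GIoU), where
    GIoU = IoU - |C \ (A u B)| / |C| and C is the smallest
    axis-aligned box enclosing both A and B."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    # intersection
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    iou = inter / union
    # smallest enclosing box
    cw = max(ax2, bx2) - min(ax1, bx1)
    ch = max(ay2, by2) - min(ay1, by1)
    c_area = cw * ch
    giou = iou - (c_area - union) / c_area
    return iou, giou

print(iou_and_giou((0, 0, 2, 2), (1, 1, 3, 3)))  # IoU = 1/7, GIoU = 1/7 - 2/9
```

Used as a loss (typically 1 - GIoU), this gives a gradient even when predicted and ground-truth boxes do not overlap, which plain IoU cannot.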
Section: Detection Ability Of UAVs
“…We refer to an understanding method with this property as online in this paper. In contrast, an understanding method is offline if the information is extracted only after the UAV has finished its aerial mission, so that the processing result has no influence on the mission (e.g., filming an area [14]). The online method, which extracts information from real-time images, has wider application prospects than the offline one.…”
What makes unmanned aerial vehicles (UAVs) intelligent is their capability of sensing and understanding new unknown environments. Some studies utilize computer vision algorithms like Visual Simultaneous Localization and Mapping (VSLAM) and Visual Odometry (VO) to sense the environment for pose estimation, obstacle avoidance and visual servoing. However, understanding a new environment (i.e., making the UAV recognize generic objects) is still an essential scientific problem that lacks a solution. Therefore, this paper takes a step toward understanding the items in an unknown environment. The aim of this research is to equip the UAV with basic understanding capability for high-level UAV flock applications in the future. Specifically, first, the proposed understanding method combines machine learning and traditional algorithms to understand the unknown environment through RGB images; second, the You Only Look Once (YOLO) object detection system is integrated (based on TensorFlow) into a smartphone to perceive the position and category of 80 classes of objects in the images; third, the method makes the UAV more intelligent and liberates the operator from manual labor; fourth, detection accuracy and latency under working conditions are quantitatively evaluated, and the properties of generality (usable on various platforms), transportability (easily deployed from one platform to another) and scalability (easily updated and maintained) for UAV flocks are qualitatively discussed. The experiments suggest that the method has enough accuracy to recognize various objects with high computational speed, along with excellent generality, transportability and scalability.
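The abstract states that detection latency under working conditions is quantitatively evaluated. A generic way to measure per-frame latency and throughput is sketched below; the `detect` callable is a hypothetical stand-in for the paper's TensorFlow-based YOLO detector, which is not reproduced here:

```python
import time

def measure_detector_latency(detect, frames):
    """Return (average per-frame latency in seconds, throughput in fps)
    for a detection callable applied to each frame in turn."""
    elapsed = []
    for frame in frames:
        t0 = time.perf_counter()
        detect(frame)  # run detection; result ignored for timing purposes
        elapsed.append(time.perf_counter() - t0)
    avg = sum(elapsed) / len(elapsed)
    return avg, 1.0 / avg

# usage with a dummy workload in place of a real detector
avg_s, fps = measure_detector_latency(lambda f: sum(range(10000)), range(20))
print(f"avg latency: {avg_s * 1e3:.3f} ms, throughput: {fps:.1f} fps")
```

For camera-driven systems, it also matters whether frames are processed synchronously (latency bounds frame rate) or in a pipeline (throughput can exceed 1/latency); the sketch above measures the synchronous case.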